PCT/US97/19541

IMPROVED ADENOVIRUS VECTORS 



FIELD OF THE INVENTION 

The invention relates to improved adenovirus vectors and, more specifically, to
adenovirus vectors useful for gene therapy.

BACKGROUND 

Adenoviruses (Ad) are double-stranded DNA viruses. The genome of adenoviruses
(~36 kb) is complex and contains over 50 open reading frames (ORFs). These ORFs are
overlapping, and genes encoding one protein are often embedded within genes coding for other
Ad proteins. Expression of Ad genes is divided into an early and a late phase. Early genes
are those transcribed prior to replication of the genome, while late genes are transcribed after
replication. The early genes comprise E1a, E1b, E2a, E2b, E3 and E4. The E1a gene
products are involved in transcriptional regulation; the E1b gene products are involved in the
shut-off of host cell functions and mRNA transport. E2a encodes a DNA-binding protein
(DBP); E2b encodes the viral DNA polymerase and preterminal protein (pTP). The E3 gene
products are not essential for viral growth in cell culture. The E4 region encodes regulatory
proteins involved in transcriptional and post-transcriptional regulation of viral gene expression;
a subset of the E4 proteins is essential for viral growth. The products of the late genes (e.g.,
L1-5) are predominantly components of the virion as well as proteins involved in the
assembly of virions. The VA genes produce VA RNAs which block the host cell from
shutting down viral protein synthesis.

Adenoviruses, or Ad vectors, have been exploited for the delivery of foreign genes to
cells for a number of reasons, including the fact that Ad vectors have been shown to be highly
effective for the transfer of genes into a wide variety of tissues in vivo and the fact that Ad
infects both dividing and non-dividing cells; a number of tissues which are targets for gene
therapy comprise largely non-dividing cells.

The current generation of Ad vectors suffers from a number of limitations which
preclude their widespread clinical use, including: 1) immune detection and elimination of cells
infected with Ad vectors, 2) a limited carrying capacity (about 8.5 kb) for the insertion of
foreign genes and regulatory elements, and 3) low-level expression of Ad genes in cells
infected with recombinant Ad vectors (generally, the expression of Ad proteins is toxic to
cells).

The latter problem was thought to be solved by using vectors containing deletions in
the E1 region of the Ad genome (E1 gene products are required for viral gene expression and
replication). However, even with such vectors, low-level expression of Ad genes is observed.
It is now thought that most mammalian cells contain E1-like factors which can substitute for
the missing Ad E1 proteins and permit expression of Ad genes remaining on the E1-deleted
vectors.

What is needed is an approach that overcomes the problem of low-level expression of
Ad genes. Such an approach needs to ensure that adenovirus vectors are safe and
non-immunogenic.

SUMMARY OF THE INVENTION 

The present invention contemplates two approaches to improving adenovirus vectors.
The first approach generally contemplates a recombinant plasmid, together with a helper
adenovirus, in a packaging cell line. The helper adenovirus is rendered safe by utilization of
loxP sequences. In the second approach, "damaged" adenoviruses are employed. While the
"damaged" adenovirus is capable of self-propagation in a packaging cell line, it is not capable
of expressing certain genes (e.g., the DNA polymerase gene and/or the adenovirus preterminal
protein gene).

In one embodiment of the first approach, the present invention contemplates a
recombinant plasmid, comprising in operable combination: a) a plasmid backbone,
comprising an origin of replication, an antibiotic resistance gene and a eukaryotic promoter
element; b) the left and right inverted terminal repeats (ITRs) of adenovirus, said ITRs each
having a 5' and a 3' end and arranged in a tail to tail orientation on said plasmid backbone;
c) the adenovirus packaging sequence, said packaging sequence having a 5' and a 3' end and
linked to one of said ITRs; and d) a first gene of interest operably linked to said promoter
element.

While it is not intended that the present invention be limited by the precise size of the
plasmid, it is generally desirable that the recombinant plasmid have a total size of between 27
and 40 kilobase pairs. It is preferred that the total size of the DNA packaged into an EAM
derived from these recombinant plasmids is about the length of the wild-type adenovirus
genome (~36 kb). It is well known in the art that DNA representing about 105% of the
wild-type length may be packaged into a viral particle; thus the EAM derived from a recombinant
plasmid may contain DNA whose length is up to ~105% of the size of the wild-type genome.
The size of the recombinant plasmid may be adjusted using reporter genes and genes of
interest having various sizes (including the use of different sizes of introns within these genes)
as well as through the use of irrelevant or non-coding DNA fragments which act as "stuffer"
fragments (e.g., portions of bacteriophage genomes).
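
The size constraints described above can be illustrated with a short calculation. The following Python sketch is illustrative only: it assumes a wild-type genome length of 36 kb (the approximate figure given above) and treats the ~105% packaging figure as a hard upper limit. The function names and the example construct sizes are hypothetical and are not taken from this specification.

```python
# Values taken from the text above; all are approximate.
WILD_TYPE_KB = 36.0                           # wild-type Ad genome length (~36 kb)
PACKAGING_FRACTION = 1.05                     # ~105% of wild-type length may be packaged
PLASMID_MIN_KB, PLASMID_MAX_KB = 27.0, 40.0   # generally desirable plasmid size range


def max_packaged_kb(wild_type_kb=WILD_TYPE_KB, fraction=PACKAGING_FRACTION):
    """Approximate upper bound on the length of packaged DNA, in kb."""
    return wild_type_kb * fraction


def stuffer_needed_kb(current_kb, target_kb=WILD_TYPE_KB):
    """Length of stuffer DNA (kb) needed to bring a construct up to the target size."""
    return max(0.0, target_kb - current_kb)


def plasmid_size_ok(total_kb):
    """True if the total plasmid size falls within the 27 to 40 kb range."""
    return PLASMID_MIN_KB <= total_kb <= PLASMID_MAX_KB


if __name__ == "__main__":
    print(f"Approximate packaging limit: {max_packaged_kb():.1f} kb")
    # Hypothetical 30 kb construct: how much stuffer brings it to ~36 kb?
    print(f"Stuffer needed for a 30 kb construct: {stuffer_needed_kb(30.0):.1f} kb")
```

Under these assumptions the packaging limit works out to about 37.8 kb, so a construct already longer than the wild-type genome has less than 2 kb of headroom.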

In one embodiment of the recombinant plasmid, said 5' end of said packaging
sequence is linked to said 3' end of said left ITR. In this embodiment, said first gene of
interest is linked to said 3' end of said packaging sequence. It is not intended that the present
invention be limited by the nature of the gene of interest; a variety of genes (including both
cDNA and genomic forms) are contemplated; any gene having therapeutic value may be
inserted into the recombinant plasmids of the present invention. For example, the transfer of
the adenosine deaminase (ADA) gene is useful for the treatment of ADA-deficient patients; the
transfer of the CFTR gene is useful for the treatment of cystic fibrosis. A wide variety of
diseases are known to be due to a defect in a single gene. The plasmids, vectors and EAMs
of the present invention are useful for the transfer of a non-mutated form of a gene which is
mutated in a patient, thereby resulting in disease. The present invention is illustrated using
recombinant plasmids capable of generating encapsidated adenovirus minichromosomes
(EAMs) containing the dystrophin cDNA gene (the cDNA form of this gene is preferred due
to the large size of this gene); the dystrophin gene is non-functional in muscular dystrophy
(MD) patients. However, the present invention is not limited to the use of the dystrophin
gene for treatment of MD; the use of the utrophin (also called the dystrophin related protein)
gene is also contemplated for gene therapy for the treatment of MD [Tinsley et al. (1993)
Curr. Opin. Genet. Dev. 3:484 and (1992) Nature 360:591]; the utrophin gene product has
been reported to be capable of functionally substituting for the dystrophin gene product [Tinsley and
Davies (1993) Neuromusc. Disord. 3:539]. As the utrophin gene product is expressed in the
muscle of muscular dystrophy patients, no immune response would be directed against the
utrophin gene product expressed in cells of a host (including a human) containing the
recombinant plasmids, Ad vectors or EAMs of the present invention. While the present
invention is illustrated using plasmids containing the dystrophin gene, the plasmids, Ad
vectors and EAMs of the present invention have broad application for the transfer of any gene
whose gene product is missing or altered in activity in cells.

Embodiments are contemplated wherein the recombinant plasmid further comprises a
second gene of interest. In one embodiment, said second gene of interest is linked to said 3'
end of said right ITR. In one embodiment, said second gene of interest is a reporter gene. A
variety of reporter genes are contemplated, including but not limited to the E. coli β-galactosidase
gene, the human placental alkaline phosphatase gene, the green fluorescent protein gene and
the chloramphenicol acetyltransferase gene.

As mentioned above, the first approach also involves the use of a helper adenovirus in
combination with the recombinant plasmid. In one embodiment, the present invention
contemplates a helper adenovirus comprising i) a first and a second loxP sequence, and ii) the
adenovirus packaging sequence, said packaging sequence having a 5' and a 3' end. It is
preferred that said first loxP sequence is linked to the 5' end of said packaging sequence and
said second loxP sequence is linked to said 3' end of said packaging sequence. In one
embodiment, the helper virus comprises at least one adenovirus gene coding region.
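
The arrangement described above, in which two loxP sequences flank the packaging sequence, can be pictured with a toy string model. The Python sketch below is an illustration under stated assumptions, not the construct of this specification: it assumes that Cre recombinase acts on two loxP sites in the same orientation to excise the intervening DNA, leaving a single loxP site behind, and it uses the canonical 34 bp loxP sequence. The placeholder flanking and "packaging" sequences and the function name are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def excise_between_loxp(genome, site=LOXP):
    """Toy model of Cre-mediated excision between two directly repeated loxP sites.

    Returns (remaining_genome, excised_segment). The excised segment is the DNA
    between the two sites (in a cell it would leave as a circle carrying one
    loxP site). If fewer than two sites are found, the genome is returned
    unchanged and the excised segment is empty.
    """
    first = genome.find(site)
    if first == -1:
        return genome, ""
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome, ""
    excised = genome[first + len(site):second]
    # One loxP site remains in the linear genome after recombination.
    remaining = genome[:first] + site + genome[second + len(site):]
    return remaining, excised


# Hypothetical placeholder sequences, for illustration only.
LEFT_END = "CCGGAATTGG"
PACKAGING = "GGGTTTAAACCC"
REST = "TTGACGTCAA"

helper = LEFT_END + LOXP + PACKAGING + LOXP + REST
remaining, excised = excise_between_loxp(helper)
```

In this model the helper genome loses the segment standing in for the packaging sequence while keeping everything outside the two sites, which is the property that would prevent such a genome from being packaged.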

The present invention contemplates a mammalian cell line containing the above-described
recombinant plasmid and the above-described helper virus. It is preferred that said
cell line is a 293-derived cell line. Specifically, in one embodiment, the present invention
contemplates a mammalian cell line, comprising: a) a recombinant plasmid, comprising, in
operable combination: i) a plasmid backbone, comprising an origin of replication, an
antibiotic resistance gene and a eukaryotic promoter element, ii) the left and right inverted
terminal repeats (ITRs) of adenovirus, said ITRs each having a 5' and a 3' end and arranged
in a tail to tail orientation on said plasmid backbone, iii) the adenovirus packaging sequence,
said packaging sequence having a 5' and a 3' end and linked to one of said ITRs, and iv) a
first gene of interest operably linked to said promoter element; and b) a helper adenovirus
comprising i) a first and a second loxP sequence, and ii) the adenovirus packaging sequence,
said packaging sequence having a 5' and a 3' end. As noted previously, said helper can
further comprise at least one adenovirus gene coding region.

Overall, the first approach allows for a method of producing an adenovirus
minichromosome. In one embodiment, this method comprises: A) providing a mammalian
cell line containing: a) a recombinant plasmid, comprising, in operable combination, i) a
plasmid backbone, comprising an origin of replication, an antibiotic resistance gene and a
eukaryotic promoter element, ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in a tail to tail orientation
on said plasmid backbone, iii) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end and linked to one of the ITRs, and iv) a first gene of interest
operably linked to said promoter element; and b) a helper adenovirus comprising i) a first and a
second loxP sequence, ii) at least one adenovirus gene coding region, and iii) the adenovirus
packaging sequence, said packaging sequence having a 5' and a 3' end; and
B) growing said cell line under conditions such that said adenovirus gene coding region is
expressed and said recombinant plasmid directs the production of at least one adenoviral
minichromosome. It is desired that said adenovirus minichromosome is encapsidated.

In one embodiment, the present invention contemplates recovering said encapsidated
adenovirus minichromosome and, in turn, purifying said recovered encapsidated adenovirus
minichromosome. Thereafter, said purified encapsidated adenovirus minichromosome can be
administered to a host (e.g., a mammal). Human therapy is thereby contemplated.

It is not intended that the present invention be limited by the nature of the
administration of said minichromosomes. All types of administration are contemplated,
including direct injection (intra-muscular, intravenous, subcutaneous, etc.), inhalation, etc.

As noted above, the present invention contemplates a second approach to improving
adenovirus vectors. In the second approach, "damaged" adenoviruses are employed. In one
embodiment, the present invention contemplates a recombinant adenovirus comprising the
adenovirus E2b region having a deletion, said adenovirus capable of self-propagation in a
packaging cell line and said E2b region comprising the DNA polymerase gene and the
adenovirus preterminal protein gene. In this embodiment, said deletion can be within the
adenovirus DNA polymerase gene. Alternatively, said deletion is within the adenovirus
preterminal protein gene. Finally, the present invention also contemplates embodiments
wherein said deletion is within the adenovirus DNA polymerase and preterminal protein
genes.

The present invention further provides cell lines capable of supporting the propagation
of Ad virus containing deletions within the E2b region. In one embodiment, the invention
provides a mammalian cell line stably and constitutively expressing the adenovirus E1 gene
products and the adenovirus DNA polymerase. In one embodiment, these cell lines comprise
a recombinant adenovirus comprising a deletion within the E2b region, this E2b-deleted
recombinant adenovirus being capable of self-propagation in the cell line. The present
invention is not limited by the nature of the deletion within the E2b region. In one
embodiment, the deletion is within the adenoviral DNA polymerase gene.

The present invention provides cell lines stably expressing E1 proteins and the
adenoviral DNA polymerase, wherein the genome of the cell line contains a nucleotide
sequence encoding adenovirus DNA polymerase operably linked to a heterologous promoter.
In a particularly preferred embodiment, the cell line is selected from the group consisting of
the B-6, B-9, C-1, C-4, C-7, C-13, and C-14 cell lines.

The present invention further provides cell lines which further constitutively express 
the adenovirus preterminal protein (pTP) gene product (in addition to E1 proteins and DNA
polymerase). In one embodiment, these pTP-expressing cell lines comprise a recombinant 
adenovirus comprising a deletion within the E2b region, the recombinant adenovirus being 
capable of self-propagation in the pTP-expressing cell line. In a preferred embodiment, the 
deletion within the E2b region comprises a deletion within the adenoviral preterminal protein 
gene. In another preferred embodiment, the deletion within the E2b region comprises a 
deletion within the adenoviral (Ad) DNA polymerase and preterminal protein genes. 

In a preferred embodiment, the cell lines coexpressing pTP and Ad DNA polymerase
contain, within their genome, a nucleotide sequence encoding adenovirus preterminal protein
operably linked to a heterologous promoter. The invention is not limited by the nature of
the heterologous promoter chosen. The art knows well how to select a suitable heterologous
promoter to achieve expression in the desired host cell (e.g., 293 cells or derivatives thereof).
In a particularly preferred embodiment, the pTP- and Ad polymerase-expressing cell line is
selected from the group consisting of the C-1, C-4, C-7, C-13, and C-14 cell lines.

The present invention provides a method of producing infectious recombinant 
adenovirus particles containing an adenoviral genome containing a deletion within the E2b 
region, comprising: a) providing: i) a mammalian cell line stably and constitutively
expressing the adenovirus E1 gene products and the adenovirus DNA polymerase; and ii) a
recombinant adenovirus comprising a deletion within the E2b region, the recombinant
adenovirus being capable of self-propagation in said cell line; b) introducing the recombinant
adenovirus into the cell line under conditions such that the recombinant adenovirus is 
propagated to form infectious recombinant adenovirus particles; and c) recovering the 
infectious recombinant adenovirus particles. In a preferred embodiment, the method further 
comprises d) purifying the recovered infectious recombinant adenovirus particles. In yet 
another preferred embodiment, the method further comprises e) administering the purified 
recombinant adenovirus particles to a host which is preferably a mammal and most preferably 
a human. 

In another preferred embodiment, the mammalian cell line employed in the above
method further constitutively expresses the adenovirus preterminal protein.

The present invention further provides a recombinant plasmid capable of replicating in
a bacterial host comprising adenoviral E2b sequences, the E2b sequences containing a deletion
within the polymerase gene, the deletion resulting in reduced polymerase activity. The
present invention is not limited by the specific deletion employed to reduce polymerase
activity. In a preferred embodiment, the deletion comprises a deletion of nucleotides 8772 to
9385 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid has the
designation pΔpol. In another preferred embodiment, the recombinant plasmid has the
designation pBHG11Δpol.
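
As a worked illustration of the coordinates given above, the length implied by a deletion "of nucleotides 8772 to 9385" depends on whether the two endpoint positions are counted as deleted. The Python sketch below shows the arithmetic for both counting conventions; the function name and the `convention` parameter are hypothetical, and the specification does not state which convention it uses.

```python
def deletion_length(start, end, convention="inclusive"):
    """Number of nucleotides removed between two 1-based coordinates.

    convention="inclusive": both endpoint nucleotides are deleted.
    convention="exclusive": both endpoint nucleotides are retained and only
    the nucleotides strictly between them are deleted.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    if convention == "inclusive":
        return end - start + 1
    if convention == "exclusive":
        return max(0, end - start - 1)
    raise ValueError(f"unknown convention: {convention!r}")


if __name__ == "__main__":
    for convention in ("inclusive", "exclusive"):
        print(convention, deletion_length(8772, 9385, convention))
```

With the endpoints counted as deleted the result is 614 nucleotides; with the endpoints retained it is 612, which agrees with the 612 bp deletion mentioned in the description of Figure 9. That agreement is an observation about the arithmetic, not a statement made by the specification.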

The present invention also provides a recombinant plasmid capable of replicating in a
bacterial host comprising adenoviral E2b sequences, the E2b sequences containing a deletion
within the preterminal protein gene, the deletion resulting in the inability to express functional
preterminal protein without disruption of the VA RNA genes. The present invention is not
limited by the specific deletion employed to render the pTP inactive; any deletion within the
pTP coding region which does not disrupt the ability to express the Ad VA RNA genes may
be employed. In a preferred embodiment, the deletion comprises a deletion of nucleotides
10,705 to 11,134 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid
has the designation pΔpTP. In another preferred embodiment, the recombinant plasmid has
the designation pBHG11ΔpTP.

In a preferred embodiment, the recombinant plasmid containing a deletion within the
pTP region further comprises a deletion within the polymerase gene, this deletion resulting in
reduced (preferably absent) polymerase activity. The present invention is not limited by the
specific deletion employed to inactivate the polymerase and pTP genes. In a preferred
embodiment, the deletion comprises a deletion of nucleotides 8,773 to 9,586 and 11,067 to
12,513 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid has the
designation pAXBΔpolΔpTPVARNA+t13. In another preferred embodiment, the recombinant
plasmid has the designation pBHG11ΔpolΔpTPVARNA+t13.

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1A is a schematic representation of the Ad polymerase expression plasmid
pRSV-pol indicating that the Ad2 DNA polymerase sequences are under the transcriptional
control of the RSV-LTR/promoter element and are flanked on the 3' end by the SV-40 small-t
intron and SV-40 polyadenylation addition site.

Figure 1B is a schematic representation of the expression plasmid pRSV-pTP
indicating that the Ad2 preterminal protein sequences are under the transcriptional control of
the RSV-LTR/promoter element and are flanked on the 3' end by the SV-40 small-t intron
and SV-40 polyadenylation signals.

Figure 2 is an ethidium bromide-stained gel depicting the presence of Ad pol DNA 
sequences in genomic DNA from LP-293 cells and several hygromycin-resistant cell lines. 
The ~750 bp PCR products are indicated by the arrow.

Figure 3 is an autoradiograph depicting the results of a viral replication-complementation
assay analyzing the functional activity of the Ad polymerase protein
expressed by LP-293 cells and several hygromycin-resistant cell lines. The 8,010 bp HindIII
fragments analyzed by densitometry are indicated by an arrow.

Figure 4 is an autoradiograph indicating that cell lines B-6 and C-7 contained a 
smaller and a larger species of Ad polymerase mRNA while LP-293 derived RNA had no 
detectable hybridization signal. The locations of the two species of Ad polymerase mRNA are
indicated relative to the 28S and 18S ribosomal RNAs, and the aberrant transcript expressed
by the B-9 cell line is indicated by an arrow.

Figure 5 is an autoradiograph showing which of the cell lines that received the 
preterminal protein expression plasmid indicated the presence of pRSV-pTP sequences (arrow 
labelled "pTP") and El sequences (arrow labelled "El"). 

Figure 6 is an autoradiograph indicating in which of the cell lines transcription of 
preterminal protein is occurring (arrow labelled "~3 kb").

Figure 7 is an autoradiograph indicating that the expression of the Ad-polymerase 
could overcome the replication defect of H5sub100 at non-permissive temperatures.

Figure 8 graphically depicts plaque titre for LP-293, B-6 and C-7 cell lines infected 
with wtAd5, H5ts36, or H5sub100, and the results demonstrate that the C-7 cell line can be
used as a packaging cell line to allow the high level growth of E1-, preterminal protein-, and
polymerase-deleted Ad vectors.

Figure 9 is an autoradiograph showing that the recombinant pol- virus is viable on
pol-expressing 293 cells but not on 293 cells, which demonstrates that recombinant Ad viruses
containing the 612 bp deletion found within pΔpol lack the ability to express Ad polymerase.

Figure 10 is a schematic representation of the structure of pAd5βdys wherein the two
inverted adenovirus origins of replication are represented by a left and right inverted terminal
repeat (LITR/RITR). P1 and P2 represent the locations of probes used for Southern blot analysis.

Figure 11 graphically illustrates the total number of transducing adenovirus particles
produced (output) per serial passage on 293 cells, total input virus of either the helper (hpAP)
or Ad5βdys, and the total number of cells used in each infection.

Figures 12A-B show Southern blot analyses of viral DNA from lysates 3, 6, 9 and 12,
digested with the restriction enzymes BssHII, NruI and EcoRV. For the analyses, fragments
from the C terminus of Mus musculus dystrophin cDNA (A) or the N terminus of E. coli
β-galactosidase (B) were labeled with [32P]dCTP and used as probes.

Figures 13A-B show the physical separation of Ad5βdys from hpAP virions at the
third (A) and final (B) stages of CsCl purification.

Figure 14 graphically depicts the level of contamination of Ad5βdys EAMs by hpAP
virions obtained from the final stage of CsCl purification as measured by β-galactosidase and
alkaline phosphatase expression. The ratio of the two types of virions, Ad5βdys EAMs
(LacZ) or hpAP (AP), in each fraction is indicated in the lower graph.

Figures 15A-B are western blot (immunoblot) analyses of protein extracts from mdx
myoblasts and myotubes demonstrating the expression of β-galactosidase (A) and dystrophin
(B) in cells infected with Ad5βdys EAMs.

Figures 16A-C depict immunofluorescence of dystrophin expression in wild type
MM14 myotubes (A), uninfected mdx (B) and infected mdx myotubes (C).

Figure 17 is a schematic representation of the MCK/lacZ constructs tested to determine
what portion of the ~3.3 kb DNA fragment containing the enhancer/promoter of the MCK
gene is capable of directing high levels of expression of linked genes in muscle cells.

Figure 18 is a schematic representation of a GFP/β-gal reporter construct suitable for
assaying the expression of Cre recombinase in mammalian cells.

Figure 19 is a schematic representation of the recombination event between the loxP
shuttle vector and the Ad5dl7001 genome.

DEFINITIONS 

To facilitate understanding of the invention, a number of terms are defined below. 

The term "gene" refers to a DNA sequence that comprises control and coding
sequences necessary for the production of a polypeptide or precursor thereof. The polypeptide
can be encoded by a full length coding sequence or by any portion of the coding sequence so
long as the desired enzymatic activity is retained. The term "gene" encompasses both cDNA
and genomic forms of a given gene.

The term "wild-type" refers to a gene or gene product which has the characteristics of
that gene or gene product when isolated from a naturally occurring source. A wild-type gene
is that which is most frequently observed in a population and is thus arbitrarily designated the
"normal" or "wild-type" form of the gene. In contrast, the term "modified" or "mutant" refers
to a gene or gene product which displays modifications in sequence and/or functional
properties (i.e., altered characteristics) when compared to the wild-type gene or gene product.
It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that
they have altered characteristics when compared to the wild-type gene or gene product.

The term "oligonucleotide" as used herein is defined as a molecule comprised of two
or more deoxyribonucleotides or ribonucleotides, usually more than three (3), and typically
more than ten (10) and up to one hundred (100) or more (although preferably between twenty
and thirty). The exact size will depend on many factors, which in turn depend on the
ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any
manner, including chemical synthesis, DNA replication, reverse transcription, or a
combination thereof.

As used herein, the term "regulatory element" refers to a genetic element which 
controls some aspect of the expression of nucleic acid sequences. For example, a promoter is 
a regulatory element which facilitates the initiation of transcription of an operably linked 
coding region. Other regulatory elements are splicing signals, polyadenylation signals, 
termination signals, etc. (defined infra). 

Transcriptional control signals in eucaryotes comprise "promoter" and "enhancer"
elements. Promoters and enhancers consist of short arrays of DNA sequences that interact
specifically with cellular proteins involved in transcription [Maniatis, T. et al., Science
236:1237 (1987)]. Promoter and enhancer elements have been isolated from a variety of
eukaryotic sources including genes in yeast, insect and mammalian cells and viruses
(analogous control elements, i.e., promoters, are also found in procaryotes). The selection of
a particular promoter and enhancer depends on what cell type is to be used to express the
protein of interest. Some eukaryotic promoters and enhancers have a broad host range while
others are functional in a limited subset of cell types [for review see Voss, S.D. et al., Trends
Biochem. Sci., 11:287 (1986) and Maniatis, T. et al., supra (1987)].

The term "recombinant DNA vector" as used herein refers to DNA sequences
containing a desired coding sequence and appropriate DNA sequences necessary for the
expression of the operably linked coding sequence in a particular host organism (e.g., a
mammal). DNA sequences necessary for expression in procaryotes include a promoter,
optionally an operator sequence, a ribosome binding site and possibly other sequences.
Eukaryotic cells are known to utilize promoters, polyadenylation signals and enhancers.

The terms "in operable combination", "in operable order" and "operably linked" as
used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid
molecule capable of directing the transcription of a given gene and/or the synthesis of a
desired protein molecule is produced. The term also refers to the linkage of amino acid
sequences in such a manner that a functional protein is produced.

The term "genetic cassette" as used herein refers to a fragment or segment of DNA
containing a particular grouping of genetic elements. The cassette can be removed and
inserted into a vector or plasmid as a single unit. A plasmid backbone refers to a piece of
DNA containing at least a plasmid origin of replication and a selectable marker gene (e.g., an
antibiotic resistance gene) which allows for selection of bacterial hosts containing the plasmid;
the plasmid backbone may also include a polylinker region to facilitate the insertion of
genetic elements within the plasmid. When a particular plasmid is modified to contain
non-plasmid elements (e.g., insertion of Ad sequences and/or a eukaryotic gene of interest linked
to a promoter element), the plasmid sequences are referred to as the plasmid backbone.

Because mononucleotides are reacted to make oligonucleotides in a manner such that
the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its
neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is
referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a
mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5'
phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid
sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.

When two different, non-overlapping oligonucleotides anneal to different regions of
the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide
points towards the 5' end of the other, the former may be called the "upstream"
oligonucleotide and the latter the "downstream" oligonucleotide.

The term "primer" refers to an oligonucleotide which is capable of acting as a point of
initiation of synthesis when placed under conditions in which primer extension is initiated.
An oligonucleotide "primer" may occur naturally, as in a purified restriction digest, or may be
produced synthetically.

A primer is selected to be "substantially" complementary to a strand of specific
sequence of the template. A primer must be sufficiently complementary to hybridize with a
template strand for primer elongation to occur. A primer sequence need not reflect the exact
sequence of the template. For example, a non-complementary nucleotide fragment may be
attached to the 5' end of the primer, with the remainder of the primer sequence being
substantially complementary to the strand. Non-complementary bases or longer sequences can
be interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence of the template to hybridize and thereby form a template
primer complex for synthesis of the extension product of the primer.

"Hybridization" methods involve the annealing of a complementary sequence to the
target nucleic acid (the sequence to be detected). The ability of two polymers of nucleic acid
containing complementary sequences to find each other and anneal through base pairing
interaction is a well-recognized phenomenon. The initial observations of the "hybridization"
process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al.,
Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this
process into an essential tool of modern biology.

The complement of a nucleic acid sequence as used herein refers to an oligonucleotide
which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is
paired with the 3' end of the other, is in "antiparallel association." Certain bases not
commonly found in natural nucleic acids may be included in the nucleic acids of the present
invention and include, for example, inosine and 7-deazaguanine. Complementarity need not
be perfect; stable duplexes may contain mismatched base pairs or unmatched bases. Those
skilled in the art of nucleic acid technology can determine duplex stability empirically
considering a number of variables including, for example, the length of the oligonucleotide,
base composition and sequence of the oligonucleotide, ionic strength and incidence of
mismatched base pairs.
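
The antiparallel relationship described above can be illustrated with a short sketch. It assumes
only the four standard DNA bases; the function names are hypothetical, and modified bases such as
inosine or 7-deazaguanine are not handled.

```python
# Watson-Crick pairing table for the four standard DNA bases (assumption:
# modified bases such as inosine are outside the scope of this sketch).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with seq in antiparallel association, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(strand_a: str, strand_b: str) -> int:
    """Count mismatched positions when strand_b is aligned antiparallel to strand_a.

    Both strands are written 5'->3' and must have the same length.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be the same length")
    perfect_partner = reverse_complement(strand_a)
    return sum(1 for x, y in zip(perfect_partner, strand_b.upper()) if x != y)
```

A duplex with a nonzero mismatch count may still be stable; as stated above, stability depends on
length, base composition, ionic strength and the incidence of mismatches.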

Stability of a nucleic acid duplex is measured by the melting temperature, or "Tm."
The Tm of a particular nucleic acid duplex under specified conditions is the temperature at
which on average half of the base pairs have disassociated. The equation for calculating the
Tm of nucleic acids is well known in the art.
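
The text does not state which equation is intended. As one illustration only, the sketch below
uses the Wallace rule, a common rough approximation for short oligonucleotides in which each A or
T contributes 2 degrees C and each G or C contributes 4 degrees C. The choice of this rule and the
function name are assumptions, not part of the specification.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule is a coarse approximation for short oligonucleotides
    and ignores ionic strength, mismatches and sequence context.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc
```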

The term "probe" as used herein refers to a labeled oligonucleotide which forms a
duplex structure with a sequence in another nucleic acid, due to complementarity of at least
one sequence in the probe with a sequence in the other nucleic acid.

The term "label" as used herein refers to any atom or molecule which can be used to
provide a detectable (preferably quantifiable) signal, and which can be attached to a nucleic
acid or protein. Labels may provide signals detectable by fluorescence, radioactivity,
colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and
the like.

The terms "nucleic acid substrate" and "nucleic acid template" are used herein
interchangeably and refer to a nucleic acid molecule which may comprise single- or double-
stranded DNA or RNA.

"Oligonucleotide primers matching or complementary to a gene sequence" refers to
oligonucleotide primers capable of facilitating the template-dependent synthesis of single or
double-stranded nucleic acids. Oligonucleotide primers matching or complementary to a gene
sequence may be used in PCRs, RT-PCRs and the like.

A "consensus gene sequence" refers to a gene sequence which is derived by
comparison of two or more gene sequences and which describes the nucleotides most often
present in a given segment of the genes; the consensus sequence is the canonical sequence.
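
A minimal sketch of deriving a consensus from sequences that are already aligned and of equal
length is given below. The function name and the alphabetical tie-breaking rule are assumptions
made for illustration; the text does not say how ties are resolved.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most frequent nucleotide at each position of pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Assumption: ties are broken alphabetically, purely for determinism.
        result.append(min(base for base, n in counts.items() if n == top))
    return "".join(result)
```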

The term "polymorphic locus" is a locus present in a population which shows variation
between members of the population (i.e., the most common allele has a frequency of less than
0.95). In contrast, a "monomorphic locus" is a genetic locus at which little or no variation is
seen between members of the population (generally taken to be a locus at which the most
common allele exceeds a frequency of 0.95 in the gene pool of the population).
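
The 0.95 threshold can be applied to allele counts as sketched below. The function names are
hypothetical, and treating a frequency of exactly 0.95 as monomorphic is an assumption, since the
definitions above leave that boundary case open.

```python
def most_common_allele_frequency(allele_counts: dict[str, int]) -> float:
    """Frequency of the most common allele among all observed alleles."""
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    return max(allele_counts.values()) / total


def classify_locus(allele_counts: dict[str, int], threshold: float = 0.95) -> str:
    """Classify a locus as polymorphic or monomorphic using the 0.95 threshold."""
    freq = most_common_allele_frequency(allele_counts)
    # Assumption: a frequency exactly equal to the threshold counts as monomorphic.
    return "polymorphic" if freq < threshold else "monomorphic"
```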

The term "microorganism" as used herein means an organism too small to be observed 
with the unaided eye and includes, but is not limited to bacteria, viruses, protozoans, fungi, 
and ciliates. 

The term "microbial gene sequences" refers to gene sequences derived from a
microorganism.

The term "bacteria" refers to any bacterial species including eubacterial and 
archaebacterial species. 

The term "virus" refers to obligate, ultramicroscopic, intracellular parasites incapable
of autonomous replication (i.e., replication requires the use of the host cell's machinery).
Adenoviruses, as noted above, are double-stranded DNA viruses. The left and right inverted
terminal repeats (ITRs) are short elements located at the 5' and 3' termini of the linear Ad
genome, respectively, and are required for replication of the viral DNA. The left ITR is
located between 1-130 bp in the Ad genome (also referred to as 0-0.5 mu). The right ITR is
located from ~3.7500 bp to the end of the genome (also referred to as 99.5-100 mu). The two
ITRs are inverted repeats of each other. For clarity, the left ITR or 5' end is used to define the
5' and 3' ends of the ITRs. The 5' end of the left ITR is located at the extreme 5' end of the
linear adenoviral genome; picturing the left ITR (LITR) as an arrow extending from the 5'
end of the genome, the tail of the 5' ITR is located at mu 0 and the head of the left ITR is
located at ~0.5 mu (further, the tail of the left ITR is referred to as the 5' end of the left ITR
and the head of the left ITR is referred to as the 3' end of the left ITR). The tail of the right
or 3' ITR is located at mu 100 and the head of the right ITR is located at mu 99.5; the head
of the right ITR is referred to as the 5' end of the right ITR and the tail of the right ITR is
referred to as the 3' end of the right ITR (RITR). In the linear Ad genome, the ITRs face
each other with the head of each ITR pointing inward toward the bulk of the genome. When
arranged in a "tail to tail orientation" the tails of each ITR (which comprise the 5' end of the
LITR and the 3' end of the RITR) are located in proximity to one another while the heads of
each ITR are separated and face outward (see for example, the arrangement of the ITRs in the
EAM shown in Figure 10 herein). The "adenovirus packaging sequence" refers to the Ψ
sequence which comprises five (AI-AV) packaging signals and is required for encapsidation
of the mature linear genome; the packaging signals are located from ~194 to 358 bp in the Ad
genome (about 0.5-1.0 mu).
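
The relationship between map units and base pairs used above can be checked with a short
calculation. The sketch assumes that one map unit equals 1% of the genome length and takes the
genome length as 36,000 bp, a rounded value based on the ~36 kb figure given in the Background;
both the constant and the function names are illustrative.

```python
# Assumption: rounded genome length taken from the ~36 kb figure in the text.
GENOME_LENGTH_BP = 36_000


def mu_to_bp(mu: float, genome_length_bp: int = GENOME_LENGTH_BP) -> int:
    """Convert map units (1 mu = 1% of the genome) to an approximate bp position."""
    return round(mu / 100 * genome_length_bp)


def bp_to_mu(bp: int, genome_length_bp: int = GENOME_LENGTH_BP) -> float:
    """Convert a bp position to map units."""
    return bp / genome_length_bp * 100
```

Under these assumptions 0.5 mu corresponds to about 180 bp, and the packaging signals at ~194 to
358 bp fall at roughly 0.54 to 0.99 mu, consistent with the "about 0.5-1.0 mu" stated above.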

The phrase "at least one adenovirus gene coding region" refers to a nucleotide
sequence containing one or more adenovirus gene coding regions. A "helper adenovirus" or
"helper virus" refers to an adenovirus which is replication-competent in a particular host cell
(the host may provide Ad gene products such as E1 proteins); this replication-competent virus
is used to supply in trans functions (e.g., proteins) which are lacking in a second replication-
incompetent virus; the first replication-competent virus is said to "help" the second
replication-incompetent virus thereby permitting the propagation of the second viral genome
in the cell containing the helper and second viruses.

The term "containing a deletion within the E2b region" refers to a deletion of at least
one basepair (preferably more than one bp and preferably at least 100 and most preferably
more than 300 bp) within the E2b region of the adenovirus genome. An E2b deletion is a
deletion that prevents expression of at least one E2b gene product and encompasses deletions
within exons encoding portions of E2b-specific proteins as well as deletions within promoter
and leader sequences.

An "adenovirus minichromosome" refers to a linear molecule of DNA containing the
Ad ITRs on each end which is generated from a plasmid containing the ITRs and one or more
genes of interest. The term "encapsidated adenovirus minichromosome" or "EAM" refers to an
adenovirus minichromosome which has been packaged or encapsidated into a viral particle;
plasmids containing the Ad ITRs and the packaging signal are shown herein to produce
EAMs. When used herein, "recovering" encapsidated adenovirus minichromosomes refers to
the collection of EAMs from a cell containing an EAM plasmid and a helper virus; this cell
will direct the encapsidation of the minichromosome to produce EAMs. The EAMs may be
recovered from these cells by lysis of the cell (e.g., freeze-thawing) and pelleting of the cell
debris to yield a cell extract as described in Example 1 (Ex. 1 describes the recovery of Ad
virus from a cell, but the same technique is used to recover EAMs from a cell). "Purifying" such
minichromosomes refers to the isolation of the recovered EAMs in a more concentrated form
(relative to the cell lysate) on a density gradient as described in Example 7; purification of
recovered EAMs permits the physical separation of the EAM from any helper virus (if
present).

The term "transfection" as used herein refers to the introduction of foreign DNA into
eukaryotic cells. Transfection may be accomplished by a variety of means known to the art
including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection,
polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection,
protoplast fusion, retroviral infection, and biolistics.

The term "stable transfection" or "stably transfected" refers to the introduction and 
integration of foreign DNA into the genome of the transfected cell. The term "stable 
transfectant" refers to a cell which has stably integrated foreign DNA into the genomic DNA. 

As used herein, the term "gene of interest" refers to a gene inserted into a vector or
plasmid whose expression is desired in a host cell. Genes of interest include genes having
therapeutic value as well as reporter genes. A variety of such genes are contemplated,
including genes of interest encoding a protein which provides a therapeutic function (such as
the dystrophin gene, which is capable of correcting the defect seen in the muscle of MD
patients), the utrophin gene, the CFTR gene (capable of correcting the defect seen in cystic
fibrosis patients), etc.

The term "reporter gene" indicates a gene sequence that encodes a reporter molecule
(including an enzyme). A "reporter molecule" is detectable in any detection system,
including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical
assays), fluorescent, radioactive, and luminescent systems. In one embodiment, the present
invention contemplates the E. coli β-galactosidase gene (available from Pharmacia Biotech,
Piscataway, NJ), green fluorescent protein (GFP) (commercially available from Clontech,
Palo Alto, CA), the human placental alkaline phosphatase gene, the chloramphenicol
acetyltransferase (CAT) gene; other reporter genes are known to the art and may be
employed.

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides along
a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the
order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes
for the amino acid sequence.

DESCRIPTION OF THE INVENTION 

The present invention provides improved adenovirus vectors for the delivery of
recombinant genes to cells in vitro and in vivo. As noted above, the present invention
contemplates two approaches to improving adenovirus vectors. The first approach generally
contemplates a recombinant plasmid containing the minimal region of the Ad genome
required for replication and packaging (i.e., the left and right ITR and the packaging or Ψ
sequence) along with one or more genes of interest; this recombinant plasmid is packaged into
an encapsidated adenovirus minichromosome (EAM) when grown in parallel with an E1-
deleted helper virus in a cell line expressing the E1 proteins (e.g., 293 cells). The
recombinant adenoviral minichromosome is preferentially packaged. To prevent the
packaging of the helper virus, a helper virus containing loxP sequences flanking the Ψ
sequence is employed in conjunction with 293 cells expressing Cre recombinase; Cre-loxP
mediated recombination removes the packaging sequence from the helper genome thereby
preventing packaging of the helper during the production of EAMs. In the second approach,
"damaged" or "deleted" adenoviruses containing deletions within the E2b region are
employed. While the "damaged" adenovirus is capable of self-propagation in a packaging cell
line expressing the appropriate E2b protein(s), the E2b-deleted recombinant adenoviruses are
incapable of replicating and expressing late viral gene products outside of the packaging cell
line.
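
The Cre-loxP strategy of the first approach can be pictured with a toy model in which a genome is
an ordered list of named elements. The sketch assumes the two loxP sites are directly repeated, so
that recombination excises the intervening segment and leaves a single loxP site; the element
names and function names are hypothetical and chosen for illustration only.

```python
def cre_excise(elements: list[str], site: str = "loxP") -> list[str]:
    """Return the genome after excision of the segment between the first two sites.

    Assumption: the two sites are in direct orientation, so the intervening
    elements are removed and one copy of the site remains.
    """
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) < 2:
        return list(elements)  # fewer than two sites: nothing to excise
    first, second = positions[0], positions[1]
    return elements[: first + 1] + elements[second + 1 :]


def can_be_packaged(elements: list[str], signal: str = "psi") -> bool:
    """A genome is treated as packageable only if it still carries the packaging signal."""
    return signal in elements
```

In this model a helper genome such as LITR, loxP, psi, loxP, viral genes, RITR loses its packaging
signal after excision, whereas a minichromosome lacking loxP sites is left unchanged.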

In one embodiment, the self-propagating recombinant adenoviruses contain
deletions in the E2b region of the adenovirus genome. In another embodiment, "gutted"
viruses are contemplated; these viruses lack all viral coding regions. In addition, packaging
cell lines co-expressing E1 and E2b gene products are provided which allow the production of
infectious recombinant virus containing deletions in the E1 and E2b regions without the use
of helper virus.

The Description of the Invention is divided into the following sections: I. Self-
Propagating Adenovirus Vectors; II. Packaging Cell Lines; and III. Encapsidated Adenoviral
Minichromosomes.



I. Self-Propagating Adenovirus Vectors 

Self-propagating adenovirus (Ad) vectors have been extensively utilized to deliver
foreign genes to a great variety of cell types in vitro and in vivo. "Self-propagating viruses"
are those which can be produced by transfection of a single piece of DNA (the recombinant
viral genome) into a single packaging cell line to produce infectious virus; self-propagating
viruses do not require the use of helper virus for propagation.

Existing Ad vectors have been shown to be problematic in vivo. This is due in part
to the fact that current or first generation Ad vectors are deleted for only the early region 1
(E1) genes. These vectors are crippled in their ability to replicate normally without the trans-
complementation of E1 functions provided by human 293 cells, a packaging cell line [ATCC
CRL 1573; Graham et al. (1977) J. Gen. Virol. 36:59]. Unfortunately, with the use of high
titres of E1-deleted vectors, and the fact that there are E1-like factors present in many cell
types, E1-deleted vectors can overcome the block to replication and express other viral gene
products [Imperiale et al. (1984) Mol. Cell Biol. 4:867; Nevins (1981) Cell 26:213; and
Gaynor and Berk (1983) Cell 33:683]. The expression of viral proteins in the infected target
cells elicits a swift host immune response that is largely T-cell mediated [Yang and Wilson
(1995) J. Immunol. 155:2564 and Yang et al. (1994) Proc. Natl. Acad. Sci. USA 91:4407].
The transduced cells are subsequently eliminated, along with the transferred foreign gene. In
immuno-incompetent animals, Ad-delivered genes can be expressed for periods of up to one
year [Yang et al. (1994), supra; Vincent et al. (1993) Nature Genetics 5:130; and Yang et al.
(1995) Proc. Natl. Acad. Sci. USA 92:7257].

Another shortcoming of first generation Ad vectors is that a single recombination
event between the genome of an Ad vector and the integrated E1 sequences present in 293
cells can generate replication competent Ad (RCA), which can readily contaminate viral
stocks.

In order to further cripple viral protein expression, and also to decrease the frequency
of generating RCA, the present invention provides Ad vectors containing deletions in the E2b
region. Propagation of these E2b-deleted Ad vectors requires cell lines which express the
deleted E2b gene products. The present invention provides such packaging cell lines and for
the first time demonstrates that the E2b gene products, DNA polymerase and preterminal
protein, can be constitutively expressed in 293 cells along with the E1 gene products. With
every gene that can be constitutively expressed in 293 cells comes the opportunity to generate
new versions of Ad vectors deleted for the respective genes. This has immediate benefits:
increased carrying capacity, since the combined coding sequences of the polymerase and
preterminal proteins that can be theoretically deleted approach 4.6 kb, and a decreased
incidence of RCA generation, since two or more independent recombination events would be
required to generate RCA. Therefore, the novel E1, Ad polymerase and preterminal protein
expressing cell lines of the present invention enable the propagation of Ad vectors with a
carrying capacity approaching 13 kb, without the need for a contaminating helper virus
[Mitani et al. (1995) Proc. Natl. Acad. Sci. USA 92:3854]. In addition, when genes critical
to the viral life cycle are deleted (e.g., the E2b genes), the ability of Ad to replicate
and express other viral gene products is further crippled. This decreases immune recognition
of virally infected cells, and allows for extended durations of foreign gene expression. The most
important attribute of E1, polymerase, and preterminal protein deleted vectors, however, is
their inability to express the respective proteins, as well as a predicted lack of expression of
most of the viral structural proteins. For example, the major late promoter (MLP) of Ad is
responsible for transcription of the late structural proteins L1 through L5 [Doerfler, In
Adenovirus DNA, The Viral Genome and Its Expression (Martinus Nijhoff Publishing, Boston,
1986)]. Though the MLP is minimally active prior to Ad genome replication, the rest of the
late genes get transcribed and translated from the MLP only after viral genome replication has
occurred [Thomas and Mathews (1980) Cell 22:523]. This cis-dependent activation of late
gene transcription is a feature of DNA viruses in general, such as in the growth of polyoma
and SV-40. The polymerase and preterminal proteins are absolutely required for Ad
replication (unlike the E4 or protein IX proteins) and thus their deletion is extremely
detrimental to Ad vector late gene expression.

II. Packaging Cell Lines Constitutively Expressing E2b Gene Products

The present invention addresses the limitations of current or first generation Ad
vectors by isolating novel 293 cell lines coexpressing critical viral gene functions. The
present invention describes the isolation and characterization of 293 cell lines capable of
constitutively expressing the Ad polymerase protein. In addition, the present invention
describes the isolation of 293 cells which not only express the E1 and polymerase proteins,
but also the Ad-preterminal protein. The isolation of cell lines coexpressing the E1, Ad
polymerase and preterminal proteins demonstrates that three genes critical to the life cycle of
Ad can be constitutively coexpressed, without toxicity.

In order to delete critical genes from self-propagating Ad vectors, the proteins encoded
by the targeted genes have to first be coexpressed in 293 cells along with the E1 proteins.
Therefore, only those proteins which are non-toxic when coexpressed constitutively (or toxic
proteins inducibly-expressed) can be utilized. Coexpression in 293 cells of the E1 and E4
genes has been demonstrated (utilizing inducible, not constitutive, promoters) [Yeh et al.
(1996) J. Virol. 70:559; Wang et al. (1995) Gene Therapy 2:775; and Gorziglia et al. (1996)
J. Virol. 70:4173]. The E1 and protein IX genes (a virion structural protein) have been
coexpressed [Caravokyri and Leppard (1995) J. Virol. 69:6627], and coexpression of the E1,
E4, and protein IX genes has also been described [Krougliak and Graham (1995) Hum. Gene
Ther. 6:1575].

The present invention provides, for the first time, cell lines coexpressing E1 and E2b
gene products. The E2b region encodes the viral replication proteins which are absolutely
required for Ad genome replication [Doerfler, supra and Pronk et al. (1992) Chromosoma
102:S39-S45]. The present invention provides 293 cells which constitutively express the 140
kD Ad-polymerase. While other researchers have reported the isolation of 293 cells which
express the Ad-preterminal protein utilizing an inducible promoter [Schaack et al. (1995) J.
Virol. 69:4079], the present invention is the first to demonstrate the high-level, constitutive
coexpression of the E1, polymerase, and preterminal proteins in 293 cells, without toxicity.
These novel cell lines permit the propagation of novel Ad vectors deleted for the E1,
polymerase, and preterminal proteins.

III. Encapsidated Adenoviral Minichromosomes

The present invention also provides encapsidated adenovirus minichromosomes (EAMs)
consisting of an infectious encapsidated linear genome containing Ad origins of replication,
packaging signal elements, a reporter gene (e.g., a β-galactosidase reporter gene cassette) and
a gene of interest (e.g., a full length (14 kb) dystrophin cDNA regulated by a muscle specific
enhancer/promoter). EAMs are generated by cotransfecting 293 cells with supercoiled
plasmid DNA (e.g., pAd5βdys) containing an embedded inverted origin of replication (and the
remaining above elements) together with linear DNA from E1-deleted virions expressing
human placental alkaline phosphatase (hpAP) (a helper virus). All proteins necessary for the
generation of EAMs are provided in trans from the hpAP virions and the two can be
separated from each other on equilibrium CsCl gradients. These EAMs are useful for gene
transfer to a variety of cell types both in vitro and in vivo.

Adenovirus-mediated gene transfer to muscle is a promising technology for gene
therapy of Duchenne muscular dystrophy (DMD). However, currently available recombinant
adenovirus vectors have several limitations, including a limited cloning capacity of ~8.5 kb,
and the induction of a host immune response that leads to transient gene expression of 3 to 4
weeks in immunocompetent animals. Gene therapy for DMD could benefit from the
development of adenoviral vectors with an increased cloning capacity to accommodate a full
length (~14 kb) dystrophin cDNA. This increased capacity should also accommodate gene
regulatory elements to achieve expression of transduced genes in a tissue-specific manner.
Additional vector modifications that eliminate adenoviral genes, expression of which is
associated with development of a host immune response, might greatly increase long term
expression of virally delivered genes in vivo. The constructed encapsidated adenovirus
minichromosomes of the present invention are capable of delivering up to 35 kb of non-viral
exogenous DNA. These minichromosomes are derived from bacterial plasmids containing
two fused inverted adenovirus origins of replication embedded in a circular genome, the
adenovirus packaging signals, a β-galactosidase reporter gene and a full length dystrophin
cDNA regulated by a muscle specific enhancer/promoter. The encapsidated minichromosomes
are propagated in vitro by trans-complementation with a replication defective (E1+E3 deleted)
helper virus. These minichromosomes can be propagated to high titer (>10^8/ml) and purified
on CsCl gradients due to their buoyancy difference relative to helper virus. These vectors are
able to transduce myogenic cell cultures and express dystrophin in myotubes. These results
demonstrate that encapsidated adenovirus minichromosomes are useful for gene transfer to
muscle and other tissues.

The present invention further provides methods for modifying the above-described
EAM system to enable the generation of high titer stocks of EAMs with minimal helper virus
contamination. Preferably the EAM stocks contain helper virus representing less than 1%,
preferably less than 0.1% and most preferably less than 0.01% (including 0.0%) of the final
viral isolate.
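
The contamination thresholds above can be expressed as a simple calculation. The sketch below
assumes that helper and EAM amounts are measured in the same unit (for example, particle counts);
the text does not specify the unit, and the function names and tier labels are hypothetical.

```python
def helper_percent(helper_amount: float, eam_amount: float) -> float:
    """Helper virus as a percentage of the final viral isolate (helper plus EAM)."""
    total = helper_amount + eam_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * helper_amount / total


def purity_tier(percent: float) -> str:
    """Map a helper percentage onto the three preference levels stated in the text."""
    if percent < 0.01:
        return "most preferred (<0.01%)"
    if percent < 0.1:
        return "more preferred (<0.1%)"
    if percent < 1.0:
        return "preferred (<1%)"
    return "outside preferred range"
```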

The amount of helper virus present in the EAM preparations is reduced in two ways.
The first is by selectively controlling the relative packaging efficiency of the helper virus
versus the EAM virus. The Cre-loxP excision method is employed to remove the packaging
signals from the helper virus thereby preventing the packaging of the helper virus used to
provide in trans viral proteins for the encapsidation of the recombinant adenovirus
minichromosomes. The second approach to reducing or eliminating helper virus in EAM
stocks is the use of improved physical methods for separating EAM from helper virus.

EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments and aspects 
of the present invention and are not to be construed as limiting the scope thereof. 

In the experimental disclosure which follows, the following abbreviations apply: M
(molar); mM (millimolar); µM (micromolar); mol (moles); mmol (millimoles); µmol
(micromoles); nmol (nanomoles); mu or m.u. (map unit); g (gravity); gm (grams); mg
(milligrams); µg (micrograms); pg (picograms); L (liters); ml (milliliters); µl (microliters); cm
(centimeters); mm (millimeters); µm (micrometers); nm (nanometers); hr (hour); min
(minute); msec (millisecond); °C (degrees Centigrade); AMP (adenosine 5'-monophosphate);
cDNA (copy or complementary DNA); DTT (dithiothreitol); ddH2O (double distilled water);
dNTP (deoxyribonucleotide triphosphate); rNTP (ribonucleotide triphosphate); ddNTP
(dideoxyribonucleotide triphosphate); bp (base pair); kb (kilo base pair); TLC (thin layer
chromatography); tRNA (transfer RNA); nt (nucleotide); VRC (vanadyl ribonucleoside
complex); RNase (ribonuclease); DNase (deoxyribonuclease); poly A (polyriboadenylic acid);
PBS (phosphate buffered saline); OD (optical density); HEPES (N-[2-
Hydroxyethyl]piperazine-N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS
(sodium dodecyl sulfate); Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); rpm
(revolutions per minute); ligation buffer (50 mM Tris-HCl, 10 mM MgCl2, 10 mM
dithiothreitol, 25 µg/ml bovine serum albumin, and 26 µM NAD+, and pH 7.8); EGTA
(ethylene glycol-bis(β-aminoethyl ether) N, N, N', N'-tetraacetic acid); EDTA
(ethylenediaminetetraacetic acid); ELISA (enzyme linked immunosorbent assay); LB (Luria-
Bertani broth: 10 g tryptone, 5 g yeast extract, and 10 g NaCl per liter, pH adjusted to 7.5
with 1N NaOH); superbroth (12 g tryptone, 24 g yeast extract, 5 g glycerol, 3.8 g KH2PO4
and 12.5 g K2HPO4 per liter); DMEM (Dulbecco's modified Eagle's medium); ABI (Applied
Biosystems Inc., Foster City, CA); Amersham (Amersham Corporation, Arlington Heights,
IL); ATCC (American Type Culture Collection, Rockville, MD); Beckman (Beckman
Instruments Inc., Fullerton, CA); BM (Boehringer Mannheim Biochemicals, Indianapolis, IN);
Bio-101 (Bio-101, Vista, CA); BioRad (BioRad, Richmond, CA); Brinkmann (Brinkmann
Instruments Inc., Westbury, NY); BRL, Gibco BRL and Life Technologies (Bethesda Research
Laboratories, Life Technologies Inc., Gaithersburg, MD); CRI (Collaborative Research Inc.,
Bedford, MA); Eastman Kodak (Eastman Kodak Co., Rochester, NY); Eppendorf (Eppendorf,
Eppendorf North America, Inc., Madison, WI); Falcon (Becton Dickinson Labware, Lincoln
Park, NJ); IBI (International Biotechnologies, Inc., New Haven, CT); ICN (ICN Biomedicals,
Inc., Costa Mesa, CA); Invitrogen (Invitrogen, San Diego, CA); New Brunswick (New
Brunswick Scientific Co. Inc., Edison, NJ); NEB (New England BioLabs Inc., Beverly, MA);
NEN (Du Pont NEN Products, Boston, MA); Pharmacia (Pharmacia LKB, Gaithersburg, MD);
Promega (Promega Corporation, Madison, WI); Stratagene (Stratagene Cloning Systems, La
Jolla, CA); UVP (UVP, Inc., San Gabriel, CA); USB (United States Biochemical Corp.,
Cleveland, OH); and Whatman (Whatman Lab. Products Inc., Clifton, NJ).

Unless otherwise indicated, all restriction enzymes and DNA modifying enzymes were
obtained from New England Biolabs (NEB) and used according to the manufacturer's
directions.


EXAMPLE 1 

Generation Of Packaging Cell Lines That Coexpress
The Adenovirus E1 And DNA Polymerase Proteins

In this example, packaging cell lines coexpressing Ad E1 and polymerase proteins
are described. These cell lines were shown to support the replication and growth of H5ts36,
an Ad with a temperature-sensitive mutation of the Ad polymerase protein. These
polymerase-expressing packaging cell lines can be used to prepare Ad vectors deleted for the
E1 and polymerase functions.



a) Tissue Culture And Virus Growth 

LP-293 cells (Microbix Biosystems, Toronto) were grown and serially passaged as
suggested by the supplier.

Plaque assays were performed in 60 mm dishes containing cell monolayers at ~90%
confluency. The appropriate virus dilution in a 2% DMEM solution was dripped onto the
cells, and the plates incubated at the appropriate temperature for one hour. The virus
containing media was aspirated, the monolayer was overlaid with 10 mls of a pre-warmed
EMEM agar overlay solution (0.8% Noble agar, 4% fetal calf serum, and antibiotics) and
allowed to solidify. After the appropriate incubation time (usually 7 days for incubations at
38.5°C and 10-12 days for incubations at 32°C), five mls of the agar-containing solution
containing 1.3% neutral red was overlaid onto the infected dishes and plaques were counted
the next day. An aliquot of the virus H5ts36 [Freimuth and Ginsberg (1986) Proc. Natl.
Acad. Sci. USA 83:7816] was utilized to produce high titre stocks after infection of 293 cells
at 32°C. H5ts36 is an Ad5-derived virus defective for viral replication at the nonpermissive
temperature [Miller and Williams (1987) J. Virol. 61:3630].

The infected cells were harvested after the onset of extensive cytopathic effect,
pelleted by centrifugation and resuspended in 10 mM Tris-Cl, pH 8.0. The lysate was freeze-
thawed three times and centrifuged to remove the cell debris. The cleared lysate was applied
to CsCl step gradients (heavy CsCl at density of 1.45 g/ml, the light CsCl at density of 1.20
g/ml), ultracentrifuged, and purified using standard techniques [Graham and Prevec (1991) In
Methods in Molecular Biology, Vol. 7: Gene Transfer and Expression Protocols, Murray
(ed.), Humana Press, Clifton, NJ, pp. 109-128]. The concentration of plaque forming units
(pfu) of this stock was determined at 32°C as described above. Virion DNA was extracted
from the high titre stock by pronase digestion, phenol-chloroform extraction, and ethanol
precipitation. The leakiness of this stock was found to be <1 in 2000 pfu at the non-
permissive temperature, consistent with previous reports [Miller and Williams (1987), supra].
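
The titer and leakiness figures above follow from simple arithmetic on plaque counts. The sketch
below is illustrative only: the inoculum volume is not given in the text and is treated as a
hypothetical parameter, as are the function names and any example values.

```python
def titer_pfu_per_ml(plaques: int, dilution: float, volume_ml: float) -> float:
    """Titer in pfu/ml = plaques counted / (dilution factor x volume plated in ml).

    Assumption: dilution is expressed as a fraction (e.g. 1e-6 for a
    million-fold dilution) and volume_ml is the hypothetical inoculum volume.
    """
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return plaques / (dilution * volume_ml)


def leakiness(titer_nonpermissive: float, titer_permissive: float) -> float:
    """Fraction of the stock that forms plaques at the non-permissive temperature."""
    if titer_permissive <= 0:
        raise ValueError("permissive titer must be positive")
    return titer_nonpermissive / titer_permissive
```

A stock meets the "<1 in 2000 pfu" figure when the returned leakiness is below 1/2000, that is,
below 0.0005.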

b) Plasmids 

The expression plasmid pRSV-pol [Zhao and Padmanabhan (1988) Cell 55:1005]
contains sequences encoding the Ad2 polymerase mRNA (including the start codon from the
exon at map unit 39) under the transcriptional control of the Rous Sarcoma Virus
LTR/promoter element; the Ad2 DNA polymerase sequences are flanked on the 3' end by the
SV-40 small t intron and SV-40 polyadenylation addition site (see Fig. 1A). Figure 1A
provides a schematic representation of the Ad polymerase expression plasmid pRSV-pol.
pRSV-pol includes the initiator methionine and amino-terminal peptides encoded by the exon
at m.u. 39 of the Ad genome. The location of the PCR primers p602a and p2158c, the two
ScaI 1 kb probes utilized for Northern analyses, the polymerase terminator codon, and the
polyadenylation site of the IVa2 gene (at 11.2 m.u. of the Ad genome) are indicated.

The expression plasmid pRSV-pTP [Zhao and Padmanabhan (1988), supra] contains
sequences encoding the Ad2 preterminal protein (including the amino terminal peptides
encoded by the exon at map unit 39 of the Ad genome) under the transcriptional control of
the Rous Sarcoma Virus LTR/promoter element; the pTP sequences are flanked on the 3' end
by the SV-40 small-t intron and SV-40 polyadenylation signals (see Figure 1B for a schematic
of pRSV-pTP). Figure 1B also shows the location of the EcoRV subfragment utilized as a
probe in the genomic DNA and cellular RNA evaluations, as well as the initiator methionine
codon present from map unit 39 in the Ad5 genome. The locations of the H5in190 and
H5sub100 insertions are shown relative to the preterminal protein open-reading frame. The
following abbreviations are used in Figure 1: ORF, open reading frame; small-t, small tumor
antigen; and m.u., map unit.

pCEP4 is a plasmid containing a hygromycin expression cassette (Invitrogen).
pFG140 (Microbix Biosystems Inc., Toronto, Ontario) is a plasmid containing
sequences derived from Ad5dl309 which contain a deletion/substitution in the E3 region
[Jones and Shenk (1979) Cell 17:683]. pFG140 is infectious in single transfections of 293
cells and is used as a control for transfection efficiency.



c) Transfection Of 293 Cells 

LP-293 cells were cotransfected with BamHI-linearized pRSV-pol and BamHI-linearized
pCEP4 at a molar ratio of 10:1 using a standard CaPO4 precipitation method
[Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Press, Plainview, NY, pp. 16.33-16.36]. In addition, 293 cells were cotransfected with
BamHI-linearized pRSV-pol, BamHI-linearized pRSV-pTP and pCEP4 using a molar ratio of
10:1 (non-selectable:selectable plasmids).
Forty-eight hours after transfection, the cells were passaged into media containing
hygromycin at 100 µg/mL. Individual hygromycin resistant colonies were isolated and
expanded.



d) Isolation Of Ad Polymerase Expressing 293 Cell Lines

Twenty hygromycin resistant cell lines were expanded and screened for the ability to
express the Ad polymerase protein. Initially, the individual cell lines were assayed for the
ability to support growth of the viral polymerase mutant H5ts36 using the plaque assay
described in section a) above. It was speculated that constitutive expression of the wild type
Ad polymerase protein in a clonal population of 293 cells should allow the growth of H5ts36
at 38.5°C. However, it was unclear if constitutive expression of the Ad polymerase would be
toxic when coexpressed with the E1 proteins present in 293 cells. Similar toxicity problems
have been observed with the Ad ssDBP and the pTP [Klessig et al. (1984) Mol. Cell. Biol.
4:1354 and Schaack et al. (1995) J. Virol. 69:4079].

Of the twenty hygromycin resistant cell lines isolated, seven were able to support
plaque formation with H5ts36 at the non-permissive temperature, unlike the parental LP-293
cells; these cell lines were named B-6, B-9, C-1, C-4, C-7, C-13 and C-14 (see Table 1) (the
B-6 and B-9 cell lines received only the pRSV-pol and pCEP4 plasmids; the C-1, C-4, C-7, C-13
and C-14 cell lines received the pRSV-pol, pRSV-pTP and pCEP4 plasmids). For the results
shown in Table 1, dishes (60 mm) of near confluent cells of each cell line were infected with
the same dilution of H5ts36 at the temperature indicated, overlaid with agar media, and
stained for plaques as outlined in section a). Passage number refers to the number of serial
passages after initial transfection with the plasmid pRSV-pol.

The cell line B-6 produced plaques one day earlier than the other Ad polymerase-expressing
cell lines, which may reflect increased polymerase expression (see below). Cell
line B-9 demonstrated an increased doubling time whereas each of the other cell lines
displayed no growth disadvantages relative to the parental LP-293 cells. As shown in Table
1, even after multiple passages (in some instances up to four months of serial passaging) the
cells were still capable of H5ts36 plaque formation at the non-permissive temperature,
indicating that the RSV-LTR/promoter remained active for extended periods of time.
However, the cell lines B-9 and C-13 displayed a decreased ability to plaque the virus at 32°C
as well as at 38.5°C, suggesting that a global viral complementation defect had occurred in
these cell lines after extended passaging. The remaining cell lines screened at later passages
demonstrated no such defect, even after 20 passages (e.g., cell line B-6; Table 1).



TABLE 1

Plaquing Ability Of H5ts36 At The Non-Permissive Temperature Utilizing 293-Ad Polymerase
Expressing Cell Lines

                                   Number Of Plaques At:
Cell Line     Passage Number       32.0°C       38.5°C
293           -                    >500         0
B-6           9                    >500         >500
              20                   >500         >500
B-9           5                    >500         >500
              14                   90           18
C-1           5                    >500         >500
              13                   >500         >500
C-4           5                    >500         >500
              14                   >500         >500
C-7           5                    >500         >500
              14                   >500         >500
C-13          5                    >500         >500
              27                   120          4
C-14          5                    >500         >500
              27                   >500         370



e) Genomic Analysis Of Ad Polymerase-Expressing Cell Lines

Genomic DNA from LP-293 cells and each of the seven cell lines able to complement
H5ts36 at 38.5°C was analyzed by PCR for the presence of pRSV-pol derived sequences.
Genomic DNA from LP-293 cells and the hygromycin resistant cell lines was harvested
using standard protocols (Sambrook et al., supra) and 200 ng of DNA from each cell line was
analyzed by PCR in a solution containing 2 ng/mL of primers p602a and p2158c, 10 mM
Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2 and 0.001% gelatin. The forward primer,
p602a [5'-TTCATTTTATGTTTCAGGTTCAGGG-3' (SEQ ID NO:2)], is located in the SV-40
polyadenylation sequence. The reverse primer, p2158c [5'-TTACCGGCAGAGTGGCAGGG-3'
(SEQ ID NO:3)], is Ad-sequence specific with the 5' nucleotide located at position 3394 of
the Ad5 genome [numbering according to Doerfler (1986) Adenovirus DNA: The Viral Genome
and Its Expression, Nijhoff, Boston, MA, pp. 1-95].
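As a quick sanity check on the two primers given above, their lengths and GC content can be computed directly from the sequences. The sketch below is in Python and uses only the primer sequences as read from the text (with the OCR space inside p602a removed); the dictionary and function names are illustrative and not part of the original disclosure.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "p602a": "TTCATTTTATGTTTCAGGTTCAGGG",  # forward primer, SEQ ID NO:2
    "p2158c": "TTACCGGCAGAGTGGCAGGG",      # reverse primer, SEQ ID NO:3
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}")
```

The difference in GC content between the two primers is one plausible reason for a protocol that starts with a few low-stringency cycles before raising the annealing temperature, although the text does not state the rationale.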

PCR was performed with a Perkin Elmer 9600 Thermocycler utilizing the following
cycling parameters: initial denaturation at 94°C for 3 min, 3 cycles of denaturation at 94°C
for 30 sec, annealing at 50°C for 30 sec, and extension at 72°C for 60 sec, followed by
another 27 cycles with an increased annealing temperature of 56°C, with a final extension at
72°C for 10 minutes. PCR products were separated on a 1.0% agarose gel and visualized
with ethidium bromide staining (Figure 2). A 1 kb ladder (Gibco-BRL) was used as a size
marker, and the plasmid pRSV-pol was used as a positive control. The ~750 bp PCR products
are indicated by an arrow in Figure 2.
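The cycling parameters above can be written out as an explicit program, which makes the total number of cycles and the programmed hold time easy to verify. The following Python sketch is illustrative only: the `Step` class and `cycle` helper are hypothetical names, the 50°C annealing temperature for the first three cycles is read from the text, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def cycle(anneal_c: float) -> list:
    """One denature/anneal/extend cycle with the hold times given in the text."""
    return [
        Step("denature", 94, 30),
        Step("anneal", anneal_c, 30),
        Step("extend", 72, 60),
    ]


PROGRAM = (
    [Step("initial denaturation", 94, 180)]
    + cycle(50) * 3    # 3 low-stringency cycles
    + cycle(56) * 27   # 27 cycles at the higher annealing temperature
    + [Step("final extension", 72, 600)]
)

total_s = sum(step.seconds for step in PROGRAM)
print(f"{len(PROGRAM)} steps, {total_s / 60:.0f} min of programmed hold time")
```

Actual run time on an instrument would be longer, since the sketch counts only the programmed holds.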

As shown in Figure 2, all cell lines capable of H5ts36 plaque formation at 38.5°C 
contained the Ad pol DNA sequences, whereas the LP-293 cells did not yield any 
amplification product with these primers. This result demonstrates that each of the selected 
cell lines stably co-integrated not only the hygromycin resistance plasmid pCEP4, but also 
pRSV-pol. 

f) Complementation Of The Replication Defect Of H5ts36 By
Ad Polymerase Expressing Cell Lines

The C to T transition at position 7623 of the H5ts36 genome alters the DNA binding
affinity of the Ad polymerase protein, rendering it defective for viral replication at non-permissive
temperatures [Chen et al. (1994) Virology 205:364; Miller and Williams (1987) J.
Virol. 61:3630; and Wilkie et al. (1973) Virology 51:499]. To analyze the functional activity
of the Ad polymerase protein expressed by each of the packaging cell lines, a viral
replication-complementation assay was performed. LP-293 cells or the hygromycin resistant
cell lines were seeded onto 60 mm dishes at densities of 2.5-3.0 x 10^6 per dish, infected with
H5ts36 at a multiplicity of infection (MOI) of 10, and incubated for 24 hours at 38.5°C, or 48
hours at 32°C. Total DNA was harvested from each plate, then 2 µg of each sample were
digested with HindIII, separated on a 1.0% agarose gel, transferred to a nylon membrane, and
hybridized with 32P-labeled H5ts36 virion DNA. Densitometric analysis of the 8,010 bp
HindIII fragment in each lane was performed on a phosphorimager (Molecular Dynamics)
utilizing a gel image processing system (IP Lab Version 1.5, Sunnyvale, CA).
The resulting autoradiograph is shown in Figure 3. In Figure 3, the lane marked
"Std." (standard) contains 1 [unit illegible] of HindIII-digested H5ts36 virion DNA. The 8,010 bp
HindIII fragments analyzed by densitometry are indicated by an arrow.

As shown in Figure 3, H5ts36 had a diminished ability to replicate in LP-293 cells at
the non-permissive temperature. In contrast, all seven of the previously selected cell lines
were able to support replication of H5ts36 virion DNA at 38.5°C to levels approaching those
occurring in LP-293 cells at 32°C.

A densitometric analysis of the amount of H5ts36 viral DNA replicated in each of the
cell lines at permissive and nonpermissive temperatures is presented in Table 2 below. For
this assay, the relative amounts of the 8,010 bp HindIII fragment were compared. The
relative levels of H5ts36 virion DNA replication were determined by densitometric analysis of the
8,010 bp HindIII fragment isolated from each of the cell line DNA samples. The surface area
of the 8,010 bp fragment in 293 cells incubated at 38.5°C was designated as 1, and includes
some replicated H5ts36 virion DNA. The numbers in each column represent the ratio
between the density of the 8,010 bp fragment isolated in the indicated cell line and the
density of the same band present in LP-293 cells at 38.5°C.

As shown in Table 2, the levels of replication at the permissive temperature were all
within four-fold of each other, regardless of which cell line was analyzed, but at the non-permissive
temperature LP-293 cells reveal the H5ts36 replication defect. The viral bands
that were present in the LP-293 DNA sample at 38.5°C represented input virion DNA as well as
low level replication of H5ts36 DNA, which is generated due to the leakiness of the ts
mutation at the high MOI utilized in this experiment. The Ad polymerase-expressing cell
lines were all found to be capable of augmenting H5ts36 genome replication.

TABLE 2

Densitometric Analysis Of H5ts36 Replication

              Ratios Of 8,010 bp HindIII Fragment Generated At:
Cell Line     32.0°C          38.5°C
LP-293        [illegible]     1.0
B-6           27.3            57.2
B-9           69.0            15.8
C-1           113.6           34.1
C-4           100.2           65.5
C-7           114.9           75.0
C-13          43.1            42.2
C-14          46.7            27.8

Although one cell line (B-9) allowed H5ts36 replication to levels
only 16 fold greater than LP-293 cells, this was the same cell line that was observed to
display poor growth properties. Each of the remaining Ad polymerase-expressing cell lines
allowed substantially greater replication of H5ts36 at non-permissive temperatures, compared
to LP-293 cells (Table 2). An enhancement of replication up to 75 fold above that of LP-293
cells was observed with the cell line C-7 at 38.5°C. A substantially more rapid onset of viral
cytopathic effect in Ad polymerase-expressing cell lines was observed at either temperature.
These estimates of H5ts36 replication-complementation are conservative, since they have not
been adjusted for the low level replication of H5ts36 at 38.5°C [Miller and Williams (1987),
supra]. The leakiness of the H5ts36 mutation could potentially be overcome with the use of a
virus deleted for the Ad polymerase gene.
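The fold-change statements in this section follow directly from the Table 2 ratios, each of which is already normalized to the LP-293 band at 38.5°C. A minimal Python sketch of that comparison is given below. It uses only the seven polymerase-expressing cell lines, because no LP-293 value at 32.0°C is available here; the variable names are illustrative.

```python
# Table 2 ratios: (32.0 C, 38.5 C), each relative to LP-293 at 38.5 C = 1.0.
TABLE_2 = {
    "B-6": (27.3, 57.2),
    "B-9": (69.0, 15.8),
    "C-1": (113.6, 34.1),
    "C-4": (100.2, 65.5),
    "C-7": (114.9, 75.0),
    "C-13": (43.1, 42.2),
    "C-14": (46.7, 27.8),
}

# Spread of replication levels at the permissive temperature.
permissive = [ratio_32 for ratio_32, _ in TABLE_2.values()]
spread = max(permissive) / min(permissive)

# Cell lines with the highest and lowest complementation at 38.5 C.
best = max(TABLE_2, key=lambda line: TABLE_2[line][1])
worst = min(TABLE_2, key=lambda line: TABLE_2[line][1])

print(f"permissive-temperature spread: {spread:.1f}-fold")
print(f"highest at 38.5 C: {best} ({TABLE_2[best][1]}-fold over LP-293)")
print(f"lowest at 38.5 C: {worst} ({TABLE_2[worst][1]}-fold over LP-293)")
```

Because the reference band in LP-293 cells includes input virion DNA and some leaky replication, these ratios understate the true complementation, as the text notes.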

g) RNA Analysis Of Ad Polymerase-Expressing Cell Lines

Total RNA was extracted from each of the cell lines using the RNAzol method
[Tel-Test, Inc., Friendswood, TX 77546; Chomczynski and Sacchi (1987) Anal. Biochem.
162:156]. Fifteen micrograms of RNA from each cell line was electrophoresed on a 0.8%
agarose-formaldehyde gel, and transferred to a Nytran membrane (Schleicher & Schuell) by
blotting. The filter was UV crosslinked, and analyzed by probing with the two 32P-labeled 1
kb ScaI subfragments of Ad which span positions 6095-8105 of the Ad5 genome (see Figure
1A). These two ScaI subfragments of the Ad genome are complementary to the 5' end of the
Ad polymerase mRNA. The resulting autoradiograph is shown in Figure 4. In Figure 4, the
locations of the smaller and larger species of Ad polymerase mRNA are indicated relative to
the 28S and 18S ribosomal RNAs. The aberrant transcript expressed by the B-9 cell line is
indicated by an arrow.

The results shown in Figure 4 revealed that the RNA derived from cell lines B-6 and
C-7 contained two species of RNA, estimated to be ~4800 and ~7000 nt in length, while LP-293
derived RNA had no detectable hybridization signal. The presence of two polymerase RNA
species suggests that the polyadenylation signal of the Ad IVa2 gene (present in the Ad-pol
construct, see Fig. 1A) is being utilized by the cell RNA processing machinery, in addition to
the SV-40 polyadenylation signal. Similar analysis of RNA derived from the cell lines C-1,
C-4, C-13, and C-14 also detected the same two transcripts as those detected in the RNA of
cell lines B-6 and C-7, but at decreased levels, suggesting that even low levels of Ad
polymerase mRNA expression can allow for the efficient replication of polymerase mutants
such as H5ts36. The cell line B-6 expressed high levels of polymerase transcript and can
plaque H5ts36 one day earlier than the other cell lines at 38.5°C, suggesting a causal
relationship. It is interesting to note that the two polymerase transcripts are also detected in
RNA isolated from cell line B-9, but substantial amounts of a larger RNA transcript (size >
10 kb) are also present (see Figure 4). The high level production of the aberrant message may
be related to the increased doubling time previously noted in this cell line.

h) Transfectability Of Ad Polymerase-Expressing Cell Lines

The ability of Ad polymerase-expressing 293 cell lines to support production of
H5ts36 virions after transfection with H5ts36 genome DNA was examined as follows. 293
cells as well as hygromycin resistant cell lines were grown to near confluency on 60 mm
dishes and transfected with either 3 µg of purified H5ts36 virion DNA, or with 3.5 µg of the
plasmid pFG140 (Microbix Biosystems), using the cationic lipid Lipofectamine (Gibco-BRL).
Cells that received the H5ts36 virion DNA were incubated at 32°C for 14 days, or 38.5°C for
10 days. The pFG140 transfected cells were incubated at 37.5°C for 10 days. All plates
were then stained with the neutral red agar overlay and plaques were counted the next day.
The results are shown in Table 3.

TABLE 3

Transfection Efficiency Of Ad pol-Expressing Cell Lines

              Number Of Plaques At:
Cell Line     32.0°C       38.5°C
LP-293        >500         0
B-6           >500         >500
B-9           n.d.         n.d.
C-1           n.d.         >500
C-4           n.d.         >500
C-7           n.d.         >500
C-13          n.d.         100
C-14          n.d.         >500

n.d. = not determined.



The results shown in Table 3 demonstrated that transfection of H5ts36 DNA at the
non-permissive temperature allows for ample plaque production in all of the Ad polymerase-expressing
cell lines tested, unlike the parental LP-293 cells. Cell line C-13 was at passage
number 29, and demonstrated a somewhat decreased ability to generate plaques at this
extended passage number. These same cell lines are also capable of producing plaques when
transfected with the plasmid pFG140, a plasmid capable of producing infectious, E1-dependent
Ad upon transfection of the parental 293 cells [Ghosh-Choudhury (1986) Gene
50:161]. These observations suggest that the Ad polymerase expressing cell lines should be
useful for the production of second generation Ad vectors deleted not only for the E1 genes,
but also for the polymerase gene. As shown below in Examples 2 and 3, this is indeed the
case.

EXAMPLE 2 

Isolation And Characterization Of Packaging Cell Lines That
Coexpress The Adenovirus E1, DNA Polymerase And Preterminal Proteins

In Example 1, packaging cell lines coexpressing Ad E1 and polymerase proteins were
described. These cell lines were shown to support the replication and growth of H5ts36, an
Ad with a temperature-sensitive mutation of the Ad polymerase protein. These polymerase-expressing
packaging cell lines can be used to prepare Ad vectors deleted for the E1 and
polymerase functions. In this example, 293 cells cotransfected with both Ad polymerase and
preterminal protein expression plasmids are characterized. Cell lines co-expressing the Ad E1,
polymerase and preterminal proteins can be used to prepare Ad vectors deleted for the E1,
polymerase and preterminal protein (pTP) functions.



a) Tissue Culture And Virus Propagation 

The use of LP-293 cells (Microbix Biosystems Inc., Toronto), Ad-polymerase
expressing cell lines, and plaquing efficiency assays of Ad viruses was conducted as described
in Example 1. All cells were maintained in 10% fetal bovine serum supplemented DMEM
media (GIBCO) in the presence of antibiotics. The virus H5sub100 [Freimuth and Ginsberg
(1986) Proc. Natl. Acad. Sci. USA 83:7816] has a temperature sensitive (ts) mutation caused
by a three base pair insertion within the amino terminus of the preterminal protein, in addition
to a deletion of the E1 sequences (see Figure 1B). H5sub100 was propagated and titred at
32.0°C in LP-293 cells; the leakiness of this stock was less than 1 per 1000 plaque-forming
units (pfu) at the nonpermissive temperature of 38.5°C. A lower titer cell lysate containing
the virus H5in190 (which contains a 12 base-pair insertion within the carboxy-terminus of the
preterminal protein as well as a deletion of the E1 region, see Figure 1B) was provided by Dr.
P. Freimuth [Freimuth and Ginsberg (1986), supra]. The polymerase and preterminal protein
expressing cell lines were always maintained in media supplemented with hygromycin
(Sigma) at 100 µg/mL.



b) Isolation Of Ad Polymerase And Preterminal Protein 
Expressing 293 Cells 

The C-1, C-4, C-7, C-13, and C-14 cell lines (Ex. 1), which had been cotransfected
with pRSV-pol, pRSV-pTP and pCEP4, were screened for presence of pTP sequences and for
the ability to support the growth of H5ts36 (ts for the Ad-polymerase), H5in190, and
H5sub100 using plaque assays as described in Example 1.

i) Analysis Of Genomic DNA And Cellular RNA

Cell lines that had received the preterminal protein expression plasmid were screened for
the presence of pRSV-pTP sequences and E1 sequences. Total DNA was isolated from the
LP-293, B-6 (transfected with pRSV-pol only) or C-7 (cotransfected with pRSV-pol and
pRSV-pTP) cell lines; two micrograms (µg) of each DNA was codigested with the restriction
enzymes XbaI and BamHI, electrophoretically separated in a 0.6% agarose gel, and
transferred onto a nylon membrane. The membrane was UV crosslinked and probed with both a
1.8 kb BlnI-XbaI fragment (spans the E1 coding region) isolated from the plasmid pFG140
and a 1.8 kb EcoRV subfragment of Ad serotype 5 (spans the preterminal protein coding
sequences; see Fig. 1B), both of which were random-primer radiolabeled with 32P to a specific
activity greater than 3.0 x 10^[illegible] cpm/µg. The membrane was subsequently exposed to X-ray
film with enhancement by a fluorescent screen. The resulting autoradiograph is shown in
Figure 5.

In Figure 5, the preterminal specific sequences migrated as an ~11.0 kb DNA fragment
while the E1 containing band migrated as a 2.3 kb DNA fragment. No hybridization of either
probe to DNA isolated from either the LP-293 or B-6 cell lines was observed. As shown in
Figure 5, only the C-7 cell line genomic DNA had preterminal coding sequences, unlike the
parental LP-293 cells, or the Ad-polymerase expressing B-6 cells. In addition, all cell lines
had E1 specific sequences present at nearly equivalent amounts, demonstrating that the
selection design has not caused the loss of the E1 sequences originally present in the LP-293
cells. The results presented in Example 1 demonstrated that both the B-6 and C-7 cell lines
contain polymerase specific sequences within their genomes, unlike the parental LP-293 cells.

To confirm that transcription of preterminal protein was occurring, total RNA was
isolated from each of the cell lines, transferred to nylon membranes, and probed to detect
preterminal protein-specific mRNA transcripts as follows. Total cellular RNA was isolated
from the respective cell lines and 15 µg of total RNA from each cell line was transferred to
nylon membranes. The membranes were probed with the 1.8 kb EcoRV radiolabeled
subfragment of Ad5 (see Fig. 1B) complementary to the preterminal protein coding region.
The resulting autoradiograph is shown in Figure 6.

As shown in Figure 6, a single mRNA of the expected size (~3 kb in length) is detected
only in RNA derived from the C-7 cell line. No hybridization was detected in lanes
containing RNA derived from the LP-293 or B-6 cell lines. In Example 1, it was
demonstrated that the C-7 cell line also expresses high levels of the Ad polymerase mRNA.
Thus, the C-7 cell line constitutively expresses both the Ad polymerase and preterminal
protein mRNAs along with E1 transcripts.



ii) Plaquing Efficiency Of pTP Mutants On pTP-Expressing Cell Lines

The C-7 cell line was screened for the ability to trans-complement the growth of
preterminal mutant viruses. The virus H5in190 (contains a 12 base pair insertion located
within the carboxy-terminus of the preterminal protein) has been shown to have a severe
growth and replication defect, producing less than 10 plaque-forming units per cell [Freimuth
and Ginsberg (1986) Proc. Natl. Acad. Sci. USA 83:7816]. The results are summarized in
Table 4 below. For the results shown in Table 4, LP-293, B-6, or C-7 cells were seeded at a
density of 2.0-2.5 x 10^6 cells per plate. The cells were infected with limiting dilutions of
lysates derived from the preterminal protein-mutant viruses H5in190 or H5sub100, incubated
at 38.5°C, and plaques counted after six days. As shown in Table 4, only the C-7 cell line
could allow efficient plaque formation of H5in190 at 38.5°C (the H5in190 lysate used to
infect the cells was of a low titer, relative to the high titer H5sub100 stock), while both the B-6
and C-7 cell lines had nearly equivalent plaquing efficiencies when H5sub100 was utilized
as the infecting virus.

TABLE 4

Plaquing Efficiency Of Preterminal Protein-Mutant Viruses

                                  Plaque Titres (pfu/ml)
Virus       Mutation Location     LP-293                 B-6                     C-7
H5in190     carboxy-terminus      <1 x 10^[illegible]    <1 x 10^[illegible]     1.4 x 10^[illegible]
H5sub100    amino-terminus        <1 x 10^[illegible]    9.0 x 10^[illegible]    4.5 x 10^[illegible]

As shown in Table 4, when equivalent dilutions of H5in190 were utilized, the plaquing
efficiency of the C-7 cell line was at least 100-fold greater than that of the B-6 or LP-293
cells. This result demonstrated that the C-7 cell line produces a functional preterminal
protein, capable of trans-complementing the defect of the H5in190 derived preterminal
protein.

The cell lines were next screened for the ability to trans-complement the
temperature-sensitive virus, H5sub100, at nonpermissive temperatures. H5sub100 has a codon
insertion mutation within the amino-terminus of the preterminal protein, as well as an E1
deletion. The mutation is responsible both for a temperature sensitive growth defect and for
a replication defect [Freimuth and Ginsberg (1986), supra and Schaack et al. (1995) J.
Virol. 69:4079]. The plaquing efficiency of the cell line C-7 was found to be at least 1000
fold greater than that of the LP-293 cells (at non-permissive temperatures) (see Table 4).
Interestingly, the cell line B-6 was also capable of producing large numbers of H5sub100
derived plaques at 38.5°C, even though it does not express any preterminal protein. This
result suggested that the high level expression of the polymerase protein was allowing plaque
formation of H5sub100 in the B-6 cell line. To examine this possibility, the nature of
H5sub100 growth in the various cell lines was examined.



c) Complementation Of The Replication And Growth Defects
Of H5sub100

The cell lines B-6 and C-7 were shown to overcome the replication defect of H5ts36
(Ex. 1). Since the preterminal and polymerase proteins are known to physically interact with
each other [Zhao and Padmanabhan (1988) Cell 55:1005], we investigated whether the
expression of the Ad-polymerase could overcome the replication defect of H5sub100 at non-permissive
temperatures using the following replication-complementation assay.

LP-293, B-6, or C-7 cells were seeded onto 60 mm dishes at a density of 2 x 10^6 cells
per dish and infected the next day with H5sub100 at a multiplicity of infection (MOI) of 0.25,
and incubated at 38.5°C for 16 hours or 32.0°C for 40 hours. The cells from each infected
plate were then harvested and total DNA extracted as described in Example 1. Four
micrograms of each DNA sample was digested with HindIII, electrophoresed through a 0.7%
agarose gel, transferred to a nylon membrane, and probed with 32P-labeled H5ts36 virion
DNA. The resulting autoradiograph is shown in Figure 7. As seen in Figure 7, the
H5sub100 replication defect is evident when the virus is grown in LP-293 cells at 38.5°C; this
defect is not present when the virus is grown at the same temperature in either B-6 or C-7 cells.
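The replication assays in these examples use very different multiplicities of infection (an MOI of 10 for H5ts36 in Example 1 and 0.25 for H5sub100 here). A common way to interpret an MOI, which is an assumption and not something stated in the text, is a Poisson model in which the fraction of cells receiving at least one infectious particle is 1 - exp(-MOI). The Python sketch below illustrates this; the function names and the cell count passed to `pfu_required` are hypothetical.

```python
import math


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one infectious particle (Poisson model)."""
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return 1.0 - math.exp(-moi)


def pfu_required(cells_per_dish: float, moi: float) -> float:
    """Plaque-forming units needed to infect a dish at the given MOI."""
    return cells_per_dish * moi


for moi in (0.25, 10):
    print(f"MOI {moi}: about {fraction_infected(moi):.1%} of cells infected")

# Illustrative cell count only; substitute the actual number of cells on the dish.
print(f"pfu needed for 1,000,000 cells at MOI 0.25: {pfu_required(1_000_000, 0.25):,.0f}")
```

Under this model a low MOI leaves most cells uninfected at the outset, which reduces the contribution of input virion DNA to the hybridization signal, whereas a high MOI infects essentially every cell.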

The results depicted in Figure 7 demonstrate that both cell lines B-6 and C-7 could
trans-complement the replication defect of H5sub100. This result demonstrated that the
expression of the Ad polymerase in B-6 cells was able to overcome the preterminal protein-mediated
replication defect of H5sub100. While not limiting the present invention to any
particular mechanism, the ability of Ad polymerase to overcome the preterminal protein-mediated
replication defect of H5sub100 may be due to a direct physical interaction of the
polymerase with the amino-terminus of the H5sub100-derived preterminal protein.

In support of this hypothesis, it has also been demonstrated that the H5sub100
replication defect can be overcome when LP-293 cells are infected with a 100-fold greater
amount of H5sub100. However, complementation of the H5sub100 replication defect is not
sufficient to overcome the growth defect of H5sub100, since temperature shift-up experiments
have demonstrated that the H5sub100 growth defect is not dependent upon viral replication
[Schaack et al. (1995), supra]. Therefore, the overexpression of the Ad-polymerase must
have allowed a very low level but detectable production of infectious H5sub100 particles in
the B-6 cell line. The reduced growth of H5sub100 is therefore not due to a replication
defect, but rather to some other critical activity in which the preterminal protein has a role, such as
augmentation of viral transcription by association with the nuclear matrix [Schaack and Shenk
(1989) Curr. Top. Microbiol. Immunol. 144:185 and Hauser and Chamberlain (1996) J. Endo.
149:373]. This was confirmed by assessing the ability of the C-7 cell line to overcome the
growth defect of H5sub100 utilizing one-step growth assays performed as follows.

Each of the cell lines (LP-293, B-6 and C-7) was seeded onto 60 mm dishes at 2.0 x
10^6 cells/dish. The cell lines were infected at an MOI of 4 with each of the appropriate
viruses (wtAd5, H5ts36, or H5sub100), and incubated at 38.5°C for 40 hours. The total
amount of infectious virions produced in each 60 mm dish was released from the cell lysates
by three cycles of freeze-thawing, and the titer was then determined by limiting dilution and
plaque assay on B-6 cells at 38.5°C. The results are summarized in Figure 8.

The results shown in Figure 8 demonstrated that even though the B-6 cell line allowed
normal replication and plaque formation of H5sub100 at 38.5°C (in fact, B-6 cells were
utilized to determine the plaque titres depicted in Figure 8), they could not allow high level
growth of H5sub100 and only produced titres of H5sub100 equivalent to that produced by the
LP-293 cells. The C-7 cell line produced 100 fold more virus than the LP-293 or B-6 cells
(see Figure 8). Encouragingly, the titre of H5sub100 produced by the C-7 cells approached
titres produced by LP-293 cells infected with wild-type virus, Ad5. When the H5sub100
virions produced from infection of the C-7 cells were used to infect LP-293 cells at 38.5°C,
all virus produced retained the ts mutation (i.e., at least a 1000 fold drop in pfu was detected
when LP-293 cells were respectively infected at 38.5°C vs. 32.0°C). This finding effectively
rules out the theoretical possibility that the H5sub100 input virus genomes recombined with
the preterminal protein sequences present in the C-7 cells.

In addition, the C-7 cell line allowed the high level growth of H5ts36, demonstrating
that adequate amounts of the Ad-polymerase protein were also being expressed. The C-7 cell
line was capable of trans-complementing the growth of both H5ts36 and H5sub100 after 4
months of serial passaging, demonstrating that the coexpression of the E1, preterminal, and
polymerase proteins was not toxic.

These results demonstrate that the constitutive expression of both the polymerase and
preterminal proteins is not detrimental to normal virus production, which might have occurred
if one or both of the proteins had to be expressed only during a narrow time period during the
Ad life cycle. In summary, these results demonstrated that the C-7 cell line can be used as a
packaging cell line to allow the high level growth of E1, preterminal, and polymerase deleted
Ad vectors.


EXAMPLE 3 

Production Of Adenovirus Vectors
Deleted For E1 And Polymerase Functions



In order to produce Ad vectors deleted for E1 and polymerase functions, a small,
frame-shifting deletion was introduced into the Ad-pol gene contained within an E1-deleted
Ad genome. The plasmid pBHG11 (Microbix) was used as the source of an E1-deleted Ad
genome. pBHG11 contains a deletion of Ad5 sequences from bp 188 to bp 1339 (0.5-3.7
m.u.); this deletion removes the packaging signals as well as E1 sequences. pBHG11 also
contains a large deletion within the E3 region (bp 27865 to bp 30995; 77.5-86.2 m.u.). The
nucleotide sequence of pBHG11 is listed in SEQ ID NO:4 [for cross-correlation between the
pBHG11 sequence and the Ad5 genome (SEQ ID NO:1), it is noted that nucleotide 8,773 in
pBHG11 is equivalent to nucleotide 7,269 in Ad5].
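The base-pair coordinates and map units quoted above are related by a simple proportion: one map unit is 1% of the genome length. The Python sketch below checks the stated deletion endpoints against that relationship. The Ad5 genome length of 35,935 bp is an assumption introduced here for illustration (the text does not state it), and the names are hypothetical.

```python
# Assumed Ad5 genome length in bp; not stated in the text.
AD5_GENOME_BP = 35_935


def bp_to_map_units(bp: int, genome_bp: int = AD5_GENOME_BP) -> float:
    """Convert a base-pair coordinate to map units (percent of genome length)."""
    return 100.0 * bp / genome_bp


# Deletion endpoints as stated in the text (Ad5 numbering).
DELETIONS = {
    "E1/packaging": (188, 1339),
    "E3": (27865, 30995),
}

for name, (start, end) in DELETIONS.items():
    print(
        f"{name}: {bp_to_map_units(start):.2f}-{bp_to_map_units(end):.2f} m.u., "
        f"endpoints {end - start} bp apart"
    )
```

Small differences from the quoted map units are expected, since they depend on the genome length assumed and on rounding.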

pBHG11 was chosen to provide the Ad backbone because this plasmid contains a large
deletion within the E3 region (77.5 to 86.2 m.u.) and therefore vectors derived from this
plasmid permit the insertion of large pieces of foreign DNA. A large cloning capacity is
important when the pol⁻ vector is to be used to transfer a large gene such as the dystrophin
gene (cDNA ~13.6 kb). However, the majority of genes are not this large and therefore
other Ad backbones containing smaller deletions within the E3 region (e.g., pBHG10 which
contains a deletion between 78.3 to 85.8 m.u.; Microbix) may be employed for the
construction of pol⁻ vectors using the strategy outlined below.

a) Construction Of A Plasmid Containing A Portion Of The
Adenovirus Genome Containing The Polymerase Gene

A fragment of the Ad genome containing the pol gene located on pBHG11 was
subcloned to create pBSA-XB. Due to the large size of the Ad genome, this intermediate
plasmid was constructed to facilitate the introduction of a deletion within the pol gene.
pBSA-XB was constructed as follows. The polylinker region of pBluescript
(Stratagene) was modified to include additional restriction enzyme recognition sites (the
sequence of the modified polylinker is provided in SEQ ID NO:5; the remainder of
pBluescript was not altered); the resulting plasmid was termed pBSX. pBHG11 was digested
with XbaI and BamHI and the 20.223 kb fragment containing the pol and pTP coding regions
(E2b region) was inserted into pBSX digested with XbaI and BamHI to generate pBSA-XB.

b) Construction Of pBHG11Δpol

A deletion was introduced into the pol coding region contained within pBSA-XB in
such a manner that other key viral elements were not disturbed (e.g., the major late promoter,
the tripartite leader sequences, the pTP gene and other leader sequences critical for normal
virus viability). The deletion of the pol sequences was carried out as follows. pBSA-XB was
digested with BspEI and the ends were filled in using T4 DNA polymerase. The BspEI-digested,
T4 polymerase-filled DNA was then digested with BamHI and the 8809 bp
BamHI/BspEI(filled) fragment was isolated as follows. The treated DNA was run on a 0.6%
agarose gel (TAE buffer) and the 8809 bp fragment was excised from the gel and purified
using a QIAEX Gel Extraction Kit according to the manufacturer's instructions (QIAGEN,
Chatsworth, CA).

A second aliquot of pBSA-XB DNA was digested with BspHI and the ends were filled
in with T4 DNA polymerase. The BspHI-digested, T4 polymerase-filled DNA was then
digested with BamHI and the 13,679 bp BamHI/BspHI(filled) fragment was isolated as
described above.

The purified 8809 bp BamHI/BspEI(filled) fragment and the purified 13,679 bp
BamHI/BspHI(filled) fragment were ligated to generate pΔpol. pΔpol contains a 612 bp
deletion within the pol gene (bp 8772 to 9385; numbering relative to that of pBHG11) and
lacks the 11.4 kb BamHI fragment containing the right arm of the Ad genome found within
pBHG11.

To provide the right arm of the Ad genome, pΔpol was digested with BamHI followed
by treatment with calf intestinal alkaline phosphatase. pBHG11 was digested with BamHI and
the 11.4 kb fragment was isolated and purified using a QIAEX Gel Extraction Kit as
described above. The purified 11.4 kb BamHI fragment was ligated to the
BamHI/phosphatased pΔpol to generate pBHG11Δpol. Proper construction of pBHG11Δpol
was confirmed by restriction digestion (HindIII).

c) Rescue And Propagation Of Ad5Δpol Virus

The Ad genome contained within pBHG11Δpol lacks the packaging signals.
Therefore, in order to recover virus containing the 612 bp deletion within the pol gene from
pBHG11Δpol, this plasmid must be cotransfected into packaging cells along with DNA that
provides a source of the Ad packaging signals. The Ad packaging signals may be provided
by wild-type or mutant Ad viral DNA or alternatively may be provided using a shuttle vector
which contains the left-end Ad5 sequences including the packaging signals, such as
pΔE1sp1A, pΔE1sp1B (Microbix) or pAdBglII (pAdBglII is a standard shuttle vector which
contains 0-1 m.u. and 9-16 m.u. of the adenovirus genome).

To rescue virus, pBHG11Δpol was co-transfected with Ad5dl7001 viral DNA.
Ad5dl7001 contains a deletion in the E3 region; the E3 deletion contained within Ad5dl7001
is smaller than the deletion contained within the Ad genome contained within pBHG11. It is
not necessary that Ad5dl7001 be used to recover virus; other adenoviruses, including
wild-type adenoviruses, may be used to rescue virus from pBHG11Δpol.

It has been reported that the generation of recombinant Ads is more efficient if Ad
DNA-terminal protein complex (TPC) is employed in conjunction with a plasmid containing
the desired deletion [Miyake et al. (1996) Proc. Natl. Acad. Sci. USA 93:1320]. Accordingly,
Ad5dl7001-TPC was prepared as described [Miyake et al. (1996), supra]. Briefly, purified
Ad5dl7001 virions (purified through an isopycnic CsCl gradient centered at 1.34 g/ml) were
lysed by the addition of an equal volume of 8 M guanidine hydrochloride. The released
Ad5dl7001 DNA-TPC was then purified through a buoyant density gradient of 2.8 M CsCl/4
M guanidine hydrochloride by centrifugation for 16 hr at 55,000 rpm in a VTi65 rotor
(Beckman). Gradient fractions containing Ad5dl7001 DNA-TPC were identified using an
ethidium bromide spot test and then pooled and dialyzed extensively against TE buffer. BSA was
then added to a final concentration of 0.5 mg/ml and aliquots were stored at -80°C. The
Ad5dl7001 DNA-TPC was then digested with ScaI and then gel-filtered through a Sephadex
G-50 spin column.

One hundred nanograms of the digested Ad5dl7001 DNA-TPC was mixed with 5 µg
of pBHG11Δpol and used to transfect pol-expressing 293 cells (i.e., C-1). Approximately 10
days post-transfection, plaques were picked. The recombinant viruses were plaque purified
and propagated using standard techniques (Graham and Prevec, supra). Viral DNA was
isolated, digested with restriction enzymes and subjected to Southern blotting analysis to
determine the organization of the recovered viruses. Two forms of virus containing the 612
bp pol deletion were recovered and termed Ad5ΔpolΔE3I and Ad5ΔpolΔE3II. One form of
recombinant pol- virus recovered, Ad5ΔpolΔE3I, underwent a double recombination event
with the Ad5dl7001 sequences and contains the E3 deletion contained within Ad5dl7001 at
the right end of the genome. The second form of recombinant pol- virus, Ad5ΔpolΔE3II,
retained pBHG11 sequences at the right end of the genome (i.e., contained the E3 deletion
found within pBHG11). These results demonstrate the production of a recombinant Ad vector
containing a deletion within the pol gene.

d) Characterization Of The E1+, pol- Viruses

To demonstrate that the deletion contained within these two pol- viruses renders the
virus incapable of producing functional polymerase, Ad5ΔpolΔE3I was used to infect 293
cells and pol-expressing 293 cells (B-6 and C-7 cell lines) and a viral
replication-complementation assay was performed as described in Example 1e. Briefly, 293, B-6 and C-7
cells were seeded onto 60 mm dishes at a density of 2 x 10^6 cells/dish and infected with
H5sub100 or Ad5ΔpolΔE3I at an MOI of 0.25. The infected cells were then incubated at
37°C or 38.5°C for 16 hours, or at 32.0°C for 40 hours. Cells from each infected plate were
then harvested and total DNA was extracted. Four micrograms of each DNA sample was
digested with HindIII, electrophoresed through an agarose gel, transferred to a nylon
membrane and probed with 32P-labeled adenoviral DNA. The resulting autoradiograph is shown in
Figure 9. In Figure 9, each panel shows, from left to right, DNA extracted from 293, B-6
and C-7 cells, respectively, infected with either H5sub100 or Ad5ΔpolΔE3I (labeled
Ad5ΔPOL in Fig. 9).
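The infection conditions above imply a fixed amount of input virus per dish. The following is a minimal sketch of that arithmetic, assuming that MOI is taken as infectious units per cell; the function name is hypothetical and is not part of the described method.

```python
def infectious_units_needed(cells_per_dish: float, moi: float) -> float:
    """Infectious units to add to one dish for a target multiplicity of infection.

    Assumes MOI = infectious units per cell (an assumption of this sketch).
    """
    if cells_per_dish < 0 or moi < 0:
        raise ValueError("cell count and MOI must be non-negative")
    return cells_per_dish * moi


# Values taken from the text: 2 x 10^6 cells per 60 mm dish, MOI of 0.25.
units = infectious_units_needed(2e6, 0.25)
print(f"{units:.0f} infectious units per dish")
```

At an MOI below 1, only a fraction of the cells receive virus in the first round, which is consistent with an assay that scores viral DNA replication rather than uniform transduction.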

As shown in Figure 9, the recombinant pol- virus was found to be viable on
pol-expressing 293 cells but not on 293 cells. These results demonstrate that recombinant Ad
viruses containing the 612 bp deletion found within pΔpol lack the ability to express Ad
polymerase. These results also demonstrate that B-6 and C-7 cells efficiently complement the
Δpol found within Ad5ΔpolΔE3I and Ad5ΔpolΔE3II. In addition, these results show that
replication of the ts pTP mutant H5sub100 can be complemented by high level expression of
the Ad polymerase with or without co-expression of pTP.

Because the expression of early genes is required for the expression of the late gene
products, the ability of recombinant viruses containing the Δpol deletion to direct the
expression of late gene products was examined. 293 and C-7 cells were infected with
Ad5ΔpolΔE3I and 24 hours after infection cell extracts were prepared. The cell extracts were
serially diluted and examined for the expression of the fiber protein (a late gene product) by
immunoblot analysis. The immunoblot was performed as described in Example 3C with the
exception that the primary antibody used was an anti-fiber antibody (FIBER-KNOB obtained
from Robert Gerard, University of Texas Southwestern Medical School). The results of this
immunoblotting analysis revealed that no fiber protein was detected from 293 cells infected
with Ad5ΔpolΔE3I (at any dilution of the cell extract). In contrast, even a 1:1000 dilution
of cell extract prepared from C-7 cells infected with Ad5ΔpolΔE3I produced a visible band
on the immunoblot. Therefore, the pol deletion contained within Ad5ΔpolΔE3I resulted in a
greater than 1000-fold decrease in fiber production.

The above results demonstrate that polymerase gene sequences can be deleted from the
virus and that the resulting deleted virus will only grow on cells producing Ad polymerase in
trans. Using the pol-expressing cell lines described herein (e.g., B-6 and C-7), large
quantities of the pol- viruses can be prepared. A dramatic shut-down in growth and late gene
expression is seen when cells which do not express Ad polymerase are infected with the pol-
viruses.



e) Generation Of E1-, pol- Ad Vectors

Ad5dl7001, used above to rescue virus containing the polymerase deletion, is an
E1-containing virus. The presence of E1 sequences on the recombinant pol- viruses is
undesirable when the recombinant virus is to be used to transfer genes into the tissues of
animals; the E1 region encodes the transforming genes and such viruses replicate extremely
well in vivo, leading to an immune response directed against cells infected with the
E1-containing virus.

E1- viruses containing the above-described polymerase deletion are generated as
follows. pBHG11Δpol is cotransfected into pol-expressing 293 cells (e.g., B-6 or C-7) along
with a shuttle vector containing the left-end Ad5 sequences including the packaging signals.
Suitable shuttle vectors include pΔE1sp1A (Microbix), pΔE1sp1B (Microbix) or pAdBglII.
The gene of interest is inserted into the polylinker region of the shuttle vector and this
plasmid is then cotransfected into B-6 or C-7 cells along with pBHG11Δpol to generate a
recombinant E1-, pol- Ad vector containing the gene of interest.

EXAMPLE 4 

Production Of Adenovirus Vectors 
Deleted For E1 And Preterminal Protein Functions


In order to produce Ad vectors deleted for E1 and preterminal protein functions, a
small deletion was introduced into the Ad preterminal protein (pTP) gene contained within an
E1-deleted Ad genome. The plasmid pBHG11 (Microbix) was used as the source of an
E1-deleted Ad genome to maximize the cloning capacity of the resulting pTP- vector. However,
other Ad backbones containing smaller deletions within the E3 region (e.g., pBHG10, which
contains a deletion between 78.3 and 85.8 m.u.; Microbix) may be employed for the
construction of pTP- vectors using the strategy outlined below.

a) Construction Of pΔpTP

A deletion was introduced into the pTP coding region contained within pBSA-XB (Ex.
3) in such a manner that other key viral elements were not disturbed (e.g., the tripartite leader
sequences, the i-leader sequences, the VA-RNA I and II genes, the 55 kD gene and the pol
gene). The deletion of the pTP sequences was carried out as follows. pBSA-XB was
digested with XbaI and EcoRV and the 7.875 kb fragment was isolated as described (Ex. 3).
Another aliquot of pBSA-XB was digested with MunI and the ends were filled in using T4
DNA polymerase. The MunI-digested, T4 polymerase-filled DNA was then digested with
XbaI and the 14.894 kb XbaI/MunI(filled) fragment was isolated as described (Ex. 3). The
7.875 kb XbaI/EcoRV fragment and the 14.894 kb XbaI/MunI(filled) fragment were ligated together
to generate pΔpTP.

b) Construction Of pBHG11ΔpTP

pΔpTP contains a 429 bp deletion within the pTP gene (bp 10,705 to 11,134;
numbering relative to that of pBHG11) and lacks the 11.4 kb BamHI fragment containing the
right arm of the Ad genome found within pBHG11.

To provide the right arm of the Ad genome, pΔpTP was digested with BamHI
followed by treatment with shrimp alkaline phosphatase (SAP; U.S. Biochemicals, Cleveland,
OH). pBHG11 was digested with BamHI and the 11.4 kb fragment was isolated and purified
using a QIAEX Gel Extraction Kit as described (Ex. 3). The purified 11.4 kb BamHI
fragment was ligated to the BamHI/phosphatased pΔpTP to generate pBHG11ΔpTP. Proper
construction of pBHG11ΔpTP was confirmed by restriction digestion.

c) Rescue And Propagation Of Ad Vectors Containing The pTP 
Deletion 

The Ad genome contained within pBHG11ΔpTP lacks the Ad packaging signals.
Therefore, in order to recover virus containing the 429 bp deletion within the pTP gene from
pBHG11ΔpTP, this plasmid must be cotransfected into packaging cells along with DNA that
provides a source of the Ad packaging signals. The Ad packaging signals may be provided
by wild-type or mutant Ad viral DNA or alternatively may be provided using a shuttle vector
which contains the left-end Ad5 sequences including the packaging signals, such as
pΔE1sp1A, pΔE1sp1B (Microbix) or pAdBglII.

Recombinant Ad vectors containing the pTP deletion which contain a deletion within
the E3 region are generated by cotransfection of pBHG11ΔpTP (the gene of interest is
inserted into the unique PacI site of pBHG11ΔpTP) with an E3-deleted Ad virus such as
Ad5dl7001 into pTP-expressing 293 cells (e.g., C-7); viral DNA-TPC are utilized as described
above in Example 3.

Recombinant vectors containing the pTP deletion which also contain deletions within
the E1 and E3 regions are generated by cotransfection of pBHG11ΔpTP into pTP-expressing
293 cells (e.g., C-7) along with a shuttle vector containing the left-end Ad5 sequences
including the packaging signals. Suitable shuttle vectors include pΔE1sp1A (Microbix),
pΔE1sp1B (Microbix) or pAdBglII. The gene of interest is inserted into the polylinker region
of the shuttle vector and this plasmid is then cotransfected into B-6 or C-7 cells along with
pBHG11ΔpTP to generate a recombinant E1-, pTP- Ad vector containing the gene of
interest.

EXAMPLE 5

Production Of Adenovirus Vectors Deleted For 
E1, Polymerase And Preterminal Protein Functions

In order to produce Ad vectors deleted for E1, polymerase and preterminal protein
functions, a deletion encompassing pol and pTP gene sequences was introduced into the Ad
sequences contained within an E1-deleted Ad genome. The plasmid pBHG11ΔE4 was used
as the source of an E1-deleted Ad genome to maximize the cloning capacity of the resulting
pol-, pTP- vector. pBHG11ΔE4 is a modified form of pBHG11 which contains a deletion of
all E4 genes except for the E4 ORF 6; the E4 region was deleted to create more room for the
insertion of a gene of interest and to further disable the virus. However, other E1- Ad
backbones, such as pBHG11 and pBHG10 (Microbix; pBHG10 contains a smaller deletion
within the E3 region as compared to pBHG11), may be employed for the construction of
pol-, pTP- vectors using the strategy outlined below.

Due to the complexity of the cloning steps required to introduce a 2.3 kb deletion that
removes portions of both the pTP and pol genes, this deletion was generated using several
steps as detailed below.

a) Construction Of pAXBΔpolΔpTPVARNA+tl3

In order to create a plasmid containing Ad sequences that have a deletion within the
pol and pTP genes, pAXBΔpolΔpTPVARNA+tl3 was constructed as follows.

pBSA-XB was digested with BspEI and the 18 kb fragment was isolated and
recircularized to create pAXBΔpolΔpTP; this plasmid contains a deletion of the sequences
contained between the BspEI sites located at 8,773 and 12,513 (numbering relative to
pBHG11).

A fragment encoding the VA-RNA3 sequence and the third leader of the tripartite
leader sequence was prepared using the PCR as follows. The PCR was carried out in a
solution containing H5ts36 virion DNA (any Ad DNA, including wild-type Ad, may be used),
2 ng/mL of primers 4005E and 4006E, 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM
MgCl2, 0.001% gelatin and Pfu polymerase. The forward primer, 4005E,
[5'-TGGCGCAGCACGGGATGCATC-3' (SEQ ID NO:6)] contains sequences complementary to
residues 12,551 to 12,571 of pBHG11 (SEQ ID NO:4). The reverse primer, 4006E,
[5'-GCGTCCGGAGGCTGCCATGCGGCAGGG-3' (SEQ ID NO:7)] is complementary to
residues 11,091 to 11,108 of pBHG11 (SEQ ID NO:4) and also contains a BspEI site (underlined).
The predicted sequence of the ~1.6 kb PCR product is listed in SEQ ID NO:8.
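The primer description above can be sanity-checked by comparing primer lengths with the stated pBHG11 coordinate ranges and by locating the BspEI site in the reverse primer. The sketch below is illustrative only: it assumes the BspEI recognition sequence TCCGGA, treats the space inside the printed reverse primer as an extraction artifact, and uses hypothetical helper names.

```python
BSPEI_SITE = "TCCGGA"  # BspEI recognition sequence (assumed for this check)

FORWARD_4005E = "TGGCGCAGCACGGGATGCATC"        # SEQ ID NO:6 as printed
REVERSE_4006E = "GCGTCCGGAGGCTGCCATGCGGCAGGG"  # SEQ ID NO:7, internal space removed


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive coordinate range."""
    return last - first + 1


def check_primers() -> dict:
    """Compare primer lengths with the coordinate ranges quoted in the text."""
    site_start = REVERSE_4006E.find(BSPEI_SITE)
    # Everything up to the end of the site is treated as a non-annealing 5' tail.
    tail_len = site_start + len(BSPEI_SITE)
    annealing_len = len(REVERSE_4006E) - tail_len
    return {
        "forward_matches_span": len(FORWARD_4005E) == span_length(12551, 12571),
        "bspei_site_offset": site_start,
        "reverse_annealing_len": annealing_len,
        "reverse_matches_span": annealing_len == span_length(11091, 11108),
    }


print(check_primers())
```

Under these assumptions the 21-nucleotide forward primer matches the 21-residue range, and the reverse primer splits into a 9-nucleotide tail carrying the BspEI site followed by 18 nucleotides matching the 18-residue range.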

PCR was performed with a Perkin Elmer 9600 Thermocycler utilizing the following
cycling parameters: initial denaturation at 94°C for 3 min, 3 cycles of denaturation at 94°C
for 30 sec, annealing at 50°C for 30 sec, and extension at 72°C for 60 sec, followed by
another 27 cycles with an increased annealing temperature of 56°C, with a final extension at
72°C for 10 minutes. The ~1.6 kb PCR product was purified using a QIAEX Gel Extraction
Kit as described (Ex. 3).
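The cycling parameters above can be written out as a step list, which makes the two-stage annealing scheme explicit. This is a sketch only: the data structure and function names are hypothetical, and the total counts programmed hold times without the instrument's ramp times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def cycle(anneal_c: float) -> list:
    """One denature/anneal/extend cycle with the given annealing temperature."""
    return [
        Step("denature", 94, 30),
        Step("anneal", anneal_c, 30),
        Step("extend", 72, 60),
    ]


# Program as described in the text: 3 cycles annealing at 50 C, then 27 at 56 C.
PROGRAM = (
    [Step("initial denaturation", 94, 180)]
    + cycle(50) * 3
    + cycle(56) * 27
    + [Step("final extension", 72, 600)]
)


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times (ramp times are not included)."""
    return sum(step.seconds for step in program)


def cycle_count(program) -> int:
    """Number of amplification cycles, counted by annealing steps."""
    return sum(1 for step in program if step.name == "anneal")


print(cycle_count(PROGRAM), "cycles,", total_hold_seconds(PROGRAM) / 60, "min of holds")
```

The lower annealing temperature in the first three cycles is consistent with a reverse primer that carries a non-annealing 5' tail; once the tail has been copied into product, the full primer can anneal at the higher temperature.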

The purified PCR fragment was then digested with BspEI. pAXBΔpolΔpTP was
digested with BspEI followed by treatment with SAP. The BspEI-digested PCR fragment and
the BspEI-SAP-treated pAXBΔpolΔpTP were ligated together to create
pAXBΔpolΔpTPVARNA+tl3.

b) Construction Of pBHG11ΔpolΔpTPVARNA+tl3

pAXBΔpolΔpTPVARNA+tl3 contains a 2.3 kb deletion within the pol and pTP genes
(bp 8,773 to 11,091; numbering relative to that of pBHG11) and lacks the 11.4 kb BamHI
fragment containing the right arm of the Ad genome.

To provide the right arm of the Ad genome, pAXBΔpolΔpTPVARNA+tl3 was
digested with BamHI followed by treatment with SAP. pBHG11ΔE4 (pBHG11 or pBHG10
may be used in place of pBHG11ΔE4) was digested with BamHI and the 11.4 kb fragment
was isolated and purified using a QIAEX Gel Extraction Kit as described (Ex. 3). The
purified 11.4 kb BamHI fragment was ligated to the BamHI/phosphatased
pAXBΔpolΔpTPVARNA+tl3 to generate pBHG11ΔpolΔpTPVARNA+tl3. Proper
construction of pBHG11ΔpolΔpTPVARNA+tl3 was confirmed by restriction digestion.

c) Rescue And Propagation Of Ad Vectors Containing The pol, 
pTP Double Deletion 

The Ad genome contained within pBHG11ΔpolΔpTPVARNA+tl3 lacks the Ad
packaging signals. Therefore, in order to recover virus containing the 2.3 kb deletion within
the pol and pTP genes from pBHG11ΔpolΔpTPVARNA+tl3, this plasmid must be
cotransfected into packaging cells along with DNA that provides a source of the Ad
packaging signals. The Ad packaging signals may be provided by wild-type or mutant Ad
viral DNA or alternatively may be provided using a shuttle vector which contains the left-end
Ad5 sequences including the packaging signals, such as pΔE1sp1A, pΔE1sp1B (Microbix) or
pAdBglII.

Recombinant Ad vectors containing the pol, pTP double deletion which also contain a
deletion within the E3 region are generated by cotransfection of
pBHG11ΔpolΔpTPVARNA+tl3 (the gene of interest is inserted into the unique PacI site of
pBHG11ΔpolΔpTPVARNA+tl3) with an E3-deleted Ad virus such as Ad5dl7001 into pol- and
pTP-expressing 293 cells (e.g., C-7); viral DNA-TPC are utilized as described (Ex. 3).

Recombinant vectors containing the pol, pTP double deletion which also contain
deletions within the E1 and E3 regions are generated by cotransfection of
pBHG11ΔpolΔpTPVARNA+tl3 into pol- and pTP-expressing 293 cells (e.g., C-7) along with
a shuttle vector containing the left-end Ad5 sequences including the packaging signals.
Suitable shuttle vectors include pΔE1sp1A (Microbix), pΔE1sp1B (Microbix) or pAdBglII.
The gene of interest is inserted into the polylinker region of the shuttle vector and this
plasmid is then cotransfected into B-6 or C-7 cells along with
pBHG11ΔpolΔpTPVARNA+tl3 to generate a recombinant E1-, pol-, pTP- Ad vector
containing the gene of interest.

EXAMPLE 6 

Encapsidated Adenovirus Minichromosomes 
Containing A Full Length Dystrophin cDNA

In this Example, the construction of an encapsidated adenovirus minichromosome
(EAM) consisting of an infectious encapsidated linear genome containing Ad origins of
replication, packaging signal elements, a β-galactosidase reporter gene cassette and a full
length (14 kb) dystrophin cDNA regulated by a muscle specific enhancer/promoter is
described. EAMs are generated by cotransfecting 293 cells with supercoiled plasmid DNA
(pAd5βdys) containing an embedded inverted origin of replication (and the remaining above
elements) together with linear DNA from E1-deleted virions expressing human placental
alkaline phosphatase (hpAP). All proteins necessary for the generation of EAMs are provided
in trans from the hpAP virions and the two can be separated from each other on equilibrium
CsCl gradients. These EAMs are useful for gene transfer to a variety of cell types both in
vitro and in vivo.

a) Generation And Propagation Of Encapsidated Adenovirus
Minichromosomes

To establish a vector system capable of delivering full length dystrophin cDNA clones,
the minimal region of Ad5 needed for replication and packaging was combined with a
conventional plasmid carrying both dystrophin and a β-galactosidase reporter gene. In the
first vector constructed, these elements were arranged such that the viral ITRs flanked (ITRs
facing outward) the reporter gene and the dystrophin gene [i.e., the vector contained from 5'
to 3' the right or 3' ITR (mu 100 to 99), the dystrophin gene and the reporter gene and the
left or 5' ITR and packaging sequence (mu 1 to 0)]. Upon introduction of this vector along
with helper virus into 293 cells, no encapsidated adenovirus minichromosomes were
recovered.

The second and successful vector, pAd5βdys (Fig. 10), contains 2.1 kb of adenovirus
DNA, together with a 14 kb murine dystrophin cDNA under the control of the mouse muscle
creatine kinase enhancer/promoter, as well as a β-galactosidase gene regulated by the human
cytomegalovirus enhancer/promoter.

Figure 10 shows the structure of pAd5βdys (27.8 kb). The two inverted adenovirus
origins of replication are represented by a left and right inverted terminal repeat
(LITR/RITR). Replication from these ITRs generates a linear genome whose termini
correspond to the 0 and 100 map unit (mu) locations. Orientation of the origin with respect
to wild type adenovirus serotype 5 sequences in mu is indicated above the figure (1 mu=360
bp). Encapsidation of the mature linear genome is enabled by five (AI-AV) packaging signals
(ψ). The E. coli β-galactosidase and Mus musculus dystrophin cDNAs are regulated by
cytomegalovirus (CMV) and muscle creatine kinase (MCK) enhancer/promoter elements,
respectively. Both expression cassettes contain the SV40 polyadenylation (pA) signal. Since
the E1A enhancer/promoter overlaps with the packaging signals, pAd5βdys was engineered
such that RNA polymerase transcribing from the E1A enhancer/promoter will encounter the
SV40 late polyadenylation signal. Pertinent restriction sites used in constructing pAd5βdys
are indicated below the figure. An adenovirus fragment corresponding to mu 6.97 to 7.77
was introduced into pAd5βdys during the cloning procedure (described below). P1 and P2
represent location of probes used for Southern blot analysis. Restriction sites destroyed
during the cloning of the Ad5 origin of replication and packaging signal are indicated in
parentheses.

pAd5βdys was constructed as follows. pBSX (Ex. 3a) was used as the backbone for
construction of pAd5βdys. The inverted fused Ad5 origin of replication and five
encapsidation signals were excised as a PstI/XbaI fragment from pAd5ori, a plasmid
containing the 6 kb HindIII fragment from pFG140 (Microbix Biosystems). This strategy also
introduced a 290 bp fragment from Ad5 corresponding to map units 6.97 to 7.77 adjacent to
the right inverted repeat (see Fig. 10). The E. coli β-galactosidase gene regulated by the
human CMV immediate early (IE) promoter/enhancer expression cassette was derived as an
EcoRV/HindIII fragment from pCMVβ [MacGregor and Caskey (1989) Nucleic Acids Res.
17:2365; the CMV IE enhancer/promoter is available from a number of suppliers, as is the E.
coli β-galactosidase gene]. The murine dystrophin expression cassette was derived as a
BssHII fragment from pCVAA, and contains a 3.3 kb MCK promoter/enhancer element
[Phelps et al. (1995) Hum. Mol. Genet. 4:1251 and Jaynes et al. (1986) Mol. Cell. Biol.
6:2855]. The sequence of the ~3.3 kb MCK promoter/enhancer element is provided in SEQ ID
NO:9 (the last nucleotide of SEQ ID NO:9 corresponds to nucleotide +7 of
the MCK gene). The enhancer element is contained within a 206 bp fragment located between
-1256 and -1050 relative to the transcription start site in the MCK gene.
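The map-unit coordinates quoted for the Ad5 fragment carried into the plasmid can be converted to base pairs using the scale given in the Figure 10 description (1 mu = 360 bp). The following is a minimal sketch of that conversion; the function name is hypothetical.

```python
BP_PER_MAP_UNIT = 360  # scale stated in the Figure 10 description: 1 mu = 360 bp


def mu_span_to_bp(start_mu: float, end_mu: float) -> float:
    """Length in base pairs of a region given by two map-unit coordinates."""
    return abs(end_mu - start_mu) * BP_PER_MAP_UNIT


# Fragment adjacent to the right inverted repeat: mu 6.97 to 7.77.
fragment_bp = mu_span_to_bp(6.97, 7.77)
print(round(fragment_bp), "bp")
```

At this scale the 0.80 mu span corresponds to roughly 288 bp, in line with the 290 bp figure quoted in the text; the small difference reflects rounding of the map-unit coordinates.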

It was hoped that the pAd5βdys plasmid would be packageable into an encapsidated
minichromosome when grown in parallel with an E1-deleted virus due to the inclusion of both
inverted terminal repeats (ITR) and the major Ad packaging signals in the plasmid. The ITRs
and packaging signals were derived from pFG140 (Microbix), a plasmid that generates
E1-defective Ad particles upon transfection of human 293 cells.

hpAP is an E1-deleted Ad5 containing the human placental alkaline phosphatase gene
[Muller et al. (1994) Circ. Res. 75:1039]. This virus was chosen to provide the helper
functions so that it would be possible to monitor the titer of the helper virus throughout serial
passages by quantitative alkaline phosphatase assays.

293 cells were cotransfected with pAd5βdys and hpAP DNA as follows. Low passage
293 cells (Microbix Biosystems) were grown and passaged as suggested by the supplier.
pAd5βdys and hpAP DNA (5 and 0.5 µg, respectively) were dissolved in 70 µl of 20 mM
HEPES buffer (pH 7.4) and incubated with 30 µl of DOTAP (BMB) for 15 min at room
temperature. This mixture was resuspended in 2 ml of DMEM supplemented with 2% fetal
calf serum (FCS) and added dropwise to a 60 mm plate of 293 cells at 80% confluency. Four
hours post-transfection the medium was replaced by DMEM with 10% FCS. Cytopathic effect
was observed 6-12 days post-transfection.

Cotransfection of 293 cells with supercoiled pAd5βdys and linear hpAP DNA
produced Ad5βdys EAMs (the encapsidated version of pAd5βdys) approximately 6 to 12 days
post-transfection, as evidenced by the appearance of a cytopathic effect (CPE). Initially, the
amount of Ad5βdys EAMs produced was significantly lower than that of hpAP virions, so the
viral suspension was used to re-infect fresh cultures, from which virus was isolated and used
for serial infection of additional cultures (Fig. 11). Infection and serial passaging of 293 cells
was carried out as follows.

Lysate from one 60 mm plate of transfected 293 cells was prepared by vigorously
washing the cells from the plate and centrifuging at 1,000 rpm in a clinical centrifuge. Cells
were resuspended in DMEM and 2% FCS and freeze-thawed in a dry ice-ethanol bath, cell debris
was removed by centrifugation, and approximately 75% of the crude lysate was used to infect 293
cells in DMEM supplemented with 2% FCS for 1 hour and then supplemented with 10% FCS
thereafter. Infection was allowed to proceed for 18-20 hrs before harvesting the virus. The
total number of cells infected in each serial passage is indicated in Figure 11.

In Figure 11, the total number of transducing adenovirus particles produced (output)
per serial passage on 293 cells, total input virus of either the helper (hpAP) or Ad5βdys, and
the total number of cells used in each infection are presented. The total number of input/output
transducing particles was determined by infection of 293 cells plated in 6-well microtiter
plates. Twenty four hours post-infection the cells were assayed for alkaline phosphatase or
β-galactosidase activity (described below) to determine the number of cells transduced with
either Ad5βdys or hpAP. The number of transducing particles was estimated by
extrapolation of the mean calculated from 31 randomly chosen 2.5 mm^2 sectors of a 961 mm^2
plate. The intra-sector differences in total output of each type of virus are presented as the
standard deviations, σ, in Figure 11. For each serial passage, 75% of the total output virus
from the previous passage was used for infection.
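The extrapolation described above scales the mean count per sampled sector by the ratio of plate area to sector area. The sketch below illustrates that calculation under stated assumptions: the function name is hypothetical, the example counts are invented for illustration and are not data from this study, and the sample standard deviation is used because the text does not say which form of σ was calculated.

```python
from statistics import mean, stdev

SECTOR_AREA_MM2 = 2.5   # area of one counted sector, from the text
PLATE_AREA_MM2 = 961.0  # total plate area, from the text


def estimate_transducing_particles(sector_counts):
    """Extrapolate per-sector counts of stained cells to the whole plate.

    Returns (estimated total, standard deviation scaled to the whole plate).
    """
    if len(sector_counts) < 2:
        raise ValueError("at least two sectors are needed for a standard deviation")
    scale = PLATE_AREA_MM2 / SECTOR_AREA_MM2
    return mean(sector_counts) * scale, stdev(sector_counts) * scale


# Hypothetical counts for 31 sectors (illustration only).
example_counts = [4, 6] * 15 + [5]
total, sigma = estimate_transducing_particles(example_counts)
print(f"estimated transduced cells per plate: {total:.0f} +/- {sigma:.0f}")
```

Each counted sector represents 961/2.5 = 384.4 sector-equivalents of plate area, so small per-sector counting differences are magnified accordingly in the reported σ.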

Alkaline phosphatase or β-galactosidase activity was determined as follows. For
detection of alkaline phosphatase, infected 293 cells on Petri dishes were rinsed twice with
phosphate buffered saline (PBS) and fixed for 10 minutes in 0.5% glutaraldehyde in PBS.
Cells were again rinsed twice with PBS for ten minutes followed by inactivation of
endogenous alkaline phosphatase activity at 65°C for 1 hr in PBS prior to the addition of the
chromogenic substrate BCIP (5-bromo-4-chloro-3-indolyl phosphate) at 0.15 mg/ml and nitro
blue tetrazolium at 0.3 mg/ml. Cells were incubated at 37°C in darkness for 3-24 hrs. For
β-galactosidase assays, the cells were fixed and washed as above, then assayed using standard
methods [MacGregor et al. (1991) In Murray (ed.), Methods in Molecular Biology, Vol. 7:
Gene Transfer and Expression Protocols, Humana Press Inc., Clifton, NJ, pp. 217-225].

As shown in Figure 11, the rate of increase in titer of Ad5βdys EAMs between
transfection and serial passage 6 was approximately 100 times greater than that for hpAP
virions. This result indicated that Ad5βdys has a replication advantage over the helper virus,
probably due to the shorter genome length (a difference of approximately 8 kb) and hence an
increased rate of packaging.

Interestingly, after serial passage 6 there was a rapid decrease in the total titer of hpAP
virions whereas the titer of Ad5βdys EAMs continued to rise. While not limiting the present
invention to any particular mechanism, at least two possible mechanisms could be responsible
for this observation. Firstly, a buildup of defective hpAP virions due to infections at high
multiplicities may slowly out-compete their full length counterparts, a phenomenon that has
been previously observed upon serial propagation of adenovirus [Daniell (1976) J. Virol.
19:685; Rosenwirth et al. (1974) Virology 60:431; and Burlingham et al. (1974) Virology
60:419]. Secondly, the emergence of replication competent virions due to recombination
events between E1 sequences in cellular DNA and the hpAP genome could lead to a buildup
of virus particles defective in expressing alkaline phosphatase [Lochmuller et al. (1994) Hum.
Gene Ther. 5:1485].

Southern analysis of DNA prepared from serial lysates 3, 6, 9 and 12 indicated that
full length dystrophin sequences were present in each of these lysates (Fig. 12A). In addition,
the correct size restriction fragments were detected using both dystrophin and β-galactosidase
probes against lysate DNA digested with several enzymes (Figs. 12A-B).

Figure 12 shows a Southern blot analysis of viral DNA from lysates 3, 6, 9 and 12,
digested with the restriction enzymes BssHII, NruI and EcoRV, indicating the presence of a
full length dystrophin cDNA in all lysates. Fragments from the C terminus of Mus musculus
dystrophin cDNA (A) or the N terminus of E. coli β-galactosidase (B) were labeled with
32P-dCTP and used as probes [Sambrook et al., supra]. The position of these probes and the
predicted fragments for each digest are indicated in Figure 10. Note that one end of each
fragment (except the 17.8 kb BssHII dystrophin fragment) detected is derived from the end of
the linearized Ad5βdys genome (see Fig. 10). Low levels of shorter products, presumably
derived from defective virions, become detectable only at high serial passage number.

At the later passages (9 and 12) there appeared to be an emergence of truncated
Ad5βdys sequences, suggesting that deletions and/or rearrangements may be occurring at later
passages. Hence, most experiments were performed with Ad5βdys EAMs derived from the
earlier passages. The possibility of the emergence of replication competent (E1-containing)
viruses was also examined by infection of HeLa cells by purified and crude serial lysates,
none of which produced any detectable CPE.

b) Purification Of Encapsidated Adenovirus Minichromosomes
On CsCl Gradients

The ability to separate Ad5βdys EAMs from hpAP virions based on their buoyancy
difference (due presumably to their different genome lengths) on CsCl gradients was
examined. Repeated fractionation of the viral lysate allows small differences in buoyancy to
be resolved. CsCl purification of encapsidated adenovirus minichromosomes was performed
as follows. Approximately 25% of the lysate prepared from various passages during serial
infections was used to purify virions. Freeze-thawed lysate was centrifuged to remove the
cell debris. The cleared lysate was extracted twice with 1,1,2-trichlorotrifluoroethane
(Sigma) and applied to CsCl step and self-forming gradients.

Purification of virus was initially achieved by passing it twice through CsCl step
gradients with densities of ρ=1.45 and ρ=1.20 in an SW28 rotor (Beckman). After isolation of
the major band in the lower gradient, the virus was passed through a self-forming gradient
(initial ρ=1.334) at 37,000 rpm for 24 hrs followed by a relaxation of the gradient by
reducing the speed to 10,000 rpm for 10 hrs in an SW41 rotor (Beckman) at 12°C [Anet and
Strayer (1969) Biochem. Biophys. Res. Commun. 34:328]. The upper band from the gradient
(composed mainly of Ad5βdys virions) was isolated using an 18 gauge needle, reloaded on a
fourth CsCl gradient (ρ=1.334) and purified at 37,000 rpm for 24 hrs followed by 10,000 rpm
for 10 hrs at 12°C.

The Ad5βdys-containing CsCl band was removed in 100 µl fractions from the top of
the centrifugation tube and CsCl was removed by chromatography on Sephadex G-50
(Pharmacia). Aliquots from each fraction were used to infect 293 cells followed by
β-galactosidase and alkaline phosphatase assays to quantitate the level of contamination by
hpAP virions in the final viral isolate.

Results of the physical separation between Ad5βdys EAMs and hpAP virions are
shown in Figure 13. Figure 13 shows the physical separation of Ad5βdys from hpAP virions
at the third (A) and final (B) stages of CsCl purification (initial ρ=1.334) in an SW41 tube.
Aliquots of Ad5βdys EAMs from the final stage were drawn through the top of the
centrifugation tube and assayed for β-galactosidase and alkaline phosphatase expression.
Results of these assays are presented in Figure 14.

Figure 14 shows the level of contamination of Ad5βdys EAMs by hpAP virions
obtained from passage 6. Following four cycles of CsCl purification, aliquots were removed
from the top of the centrifugation tube and used for infection of 293 cells, which were fixed
24 hrs later and assayed for β-galactosidase or alkaline phosphatase activity (described below).
The number of transducing particles is an underestimate of the actual total, as in some
cases a positive cell may have been infected by more than one transducing particle. The ratio
of the two types of virions, Ad5βdys EAMs (LacZ) or hpAP (AP), in each fraction is
indicated in the lower graph.
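
The underestimation noted above can be illustrated with a simple Poisson model. This model is an assumption introduced here for illustration and is not part of the assay described: if particles infect cells independently, the fraction of positive cells f and the mean number of transducing particles per cell m are related by f = 1 - exp(-m). The function name and the example value are hypothetical.

```python
import math

def particles_per_cell(positive_fraction):
    """Mean transducing particles per cell implied by the stained fraction,
    assuming Poisson-distributed infection (an illustrative assumption)."""
    if not 0.0 <= positive_fraction < 1.0:
        raise ValueError("positive_fraction must be in [0, 1)")
    return -math.log(1.0 - positive_fraction)

# Hypothetical example: half of the cells stain positive.
naive = 0.5                          # one particle counted per positive cell
corrected = particles_per_cell(0.5)  # allows for multiply infected cells
print(f"naive {naive:.3f} vs corrected {corrected:.3f} particles per cell")
```

Under this assumption the correction is negligible when few cells are positive and grows as the positive fraction approaches one.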

The maximum ratio of transducing Ad5βdys EAMs to hpAP virions reproducibly
achieved in this study was 24.8, which corresponds to helper virus contamination of
approximately 4% of the final viral isolate.
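
The relationship between the EAM:helper ratio and the stated contamination level can be checked with a short calculation. This is a minimal sketch; the function name is hypothetical, and it assumes the isolate consists only of the two particle types counted in the assays.

```python
def helper_fraction(eam_to_helper_ratio):
    """Fraction of transducing particles that are helper virus,
    given the ratio of EAMs to helper virions."""
    if eam_to_helper_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (1.0 + eam_to_helper_ratio)

fraction = helper_fraction(24.8)
print(f"helper virus: {fraction:.1%} of the isolate")  # about 3.9%, i.e. roughly 4%
```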

c) Dystrophin And β-galactosidase Expression By Encapsidated
Adenovirus Minichromosomes In Muscle Cells

To determine if the Ad5βdys EAMs were able to express β-galactosidase and
dystrophin in muscle cells, mouse mdx myogenic cultures were infected with CsCl purified
EAMs.



i) Propagation And Infection Of Muscle Cells 

MM14 and mdx myogenic cell lines were kindly provided by S. Hauschka (University
of Washington) and were cultured as previously described [Linkhart et al. (1981) Dev. Biol.
86:19 and Clegg et al. (1987) J. Cell Biol. 105:949]. Briefly, myoblasts were grown on
plastic tissue culture plates coated with 0.1% gelatin in Ham's F-10 medium containing 15%
(v/v) horse serum, 0.8 mM CaCl2, 200 ng/ml recombinant human basic fibroblast growth
factor (b-FGF) and 60 µg/ml gentamicin (proliferation medium). Cultures were induced to
differentiate by switching to growth in the presence of growth medium lacking b-FGF and
containing 10% horse serum (differentiation medium). Myoblasts or differentiated myotubes
(three days post switching) were infected at a multiplicity of infection of 2.2 Ad5βdys EAMs
per cell. Fractions containing minimal contamination with hpAP virions (3, 4 and 5 of
passage 6) were used for western and immunofluorescence analysis. Infection was allowed to
proceed for 3 days for both the myoblasts and myotubes before harvesting cells.



ii) Total Protein Extraction And Immunoblot 
Analysis 

For protein extraction, muscle cells were briefly trypsinized, transferred to a
microcentrifuge tube, centrifuged at 14 K for 3 min at room temp and resuspended two times
in PBS. After an additional centrifugation, the cell pellet was resuspended in 80 µl of RIPA
buffer (50 mM Tris-Cl, pH 7.5; 150 mM NaCl; 1% Nonidet P-40; 0.5% sodium
deoxycholate; 0.1% SDS) [Sambrook et al., supra]. The sample was briefly sheared using a
22 gauge needle to reduce viscosity and total protein concentration was assayed using the
bicinchoninic acid protein assay reagent (Pierce, Rockford, IL). Expression of full length
dystrophin or β-galactosidase in infected mdx and MM14 myoblasts or myotubes was
analyzed by electrophoresis of 40 µg of total protein extract on a 6% SDS-PAGE gel (in 25
mM Tris, 192 mM glycine, 10 mM β-mercaptoethanol, 0.1% SDS). After transferring to
Gelman Biotrace NT membrane (in 25 mM Tris, 192 mM glycine, 10 mM β-mercaptoethanol,
0.05% SDS, 20% methanol), the membrane was blocked with 5% non-fat milk and 1% goat
serum in Tris-buffered saline-Tween (TBS-T) for 12 hrs at 4°C. Immunostaining was done
according to the protocol for the ECL western blotting detection reagents (Amersham Life
Sciences, Buckingham, UK). The primary antibodies used were Dys-2 (Vector Laboratories)
and anti-β-galactosidase (BMB, Indianapolis, IN) with a horseradish peroxidase-conjugated
anti-mouse secondary antibody.

Western blot analysis of EAM-infected mdx myoblasts and myotubes (three days post-fusion)
indicated that EAMs were able to infect both of these cell types (Figure 15).
Figure 15 shows immunoblots of protein extracts from mdx myoblasts and myotubes demonstrating
the expression of β-galactosidase (A) and dystrophin (B) in cells infected with Ad5βdys
EAMs. Total protein was extracted 3 days post infection in all cases. Myotubes were
infected at three days following a switch to differentiation media. In Figure 15A, lane 1
contains total protein extract from 293 cells infected with a virus expressing β-galactosidase
as a control (RSV-LacZ); lanes 2-5 contain total protein extracts from uninfected mdx
myoblasts, mdx myoblasts infected with Ad5βdys EAMs, mdx myotubes and mdx myotubes
derived from mdx myoblasts infected with Ad5βdys EAMs, respectively. In Figure 15B,
lanes 1 and 7 contain total protein from mouse muscle ("C57") while lane 2 contains protein
from wild type MM14 myotubes, as controls. Lanes 3-6 contain total protein extracts from
uninfected mdx myoblasts, mdx myoblasts infected with Ad5βdys EAMs, mdx myotubes and
mdx myotubes derived from mdx myoblasts infected with Ad5βdys EAMs, respectively.

As shown in Figure 15, expression of β-galactosidase was detected in both the infected
mdx myoblasts and myotubes, indicating that the CMV promoter was active at both these early
stages of differentiation in muscle cells. However, only infected mdx myotubes produced
protein detected by Dys-2, an antibody recognizing the 17 C-terminal amino acids of
dystrophin (Fig. 15). No dystrophin expression was detected in infected myoblasts by western
analysis, indicating that the muscle creatine kinase promoter functions minimally, if at all,
within the Ad5βdys EAM prior to terminal differentiation of these cells.

Dystrophin expression in mdx cells infected with EAMs was confirmed by
immunofluorescence studies using N-terminal dystrophin antibodies. In agreement with the
western analysis, dystrophin expression from the MCK promoter was detected only in
differentiated mdx myotubes infected by Ad5βdys EAMs (Fig. 16). Figures 16A-C show
immunofluorescence of dystrophin in wild type MM14 myotubes (A), uninfected mdx (B) and
infected mdx myotubes (C), respectively. The results shown in Figure 16 demonstrate the
transfer and expression of recombinant dystrophin to differentiated mdx cells by Ad5βdys
EAMs.

Immunofluorescence of myogenic cells was performed as follows. Approximately 1.5
X 10'^ MM14 or mdx myoblasts were plated on poly-L-lysine (Sigma) coated glass slides (7 x
3 cm) which had been previously etched with a 0.05% chromium potassium sulfate and 0.1%
gelatin solution. For myotube analysis, the cultures were switched to differentiation media
[Clegg et al. (1987), supra] 48 hours after plating, immediately infected and then allowed to
fuse for 3 days, whereas myoblasts were continuously propagated in proliferation media
[Clegg et al. (1987), supra]. Cells were washed three times with PBS at room temperature
and fixed in 3.7% formaldehyde. For immunostaining, cells were incubated in 0.5% Triton
X-100, blocked with 1% normal goat serum and incubated with an affinity purified antibody
against the N-terminus of murine dystrophin for 2 hrs, followed by extensive washing in PBS
and 0.1% Tween-20 with gentle shaking. Cells were incubated with a 1:200 dilution of
biotin-conjugated anti-rabbit antibody (Pierce) for one hour and washed as above. Cells were
further incubated with a 1:300 dilution of streptavidin-fluorescein isothiocyanate conjugate
(Vector Labs, Burlingame, CA) for one hour and washed as above, followed by extensive
washing in PBS.

The above results show that embedded inverted Ad origins of replication coupled to an
encapsidation signal can convert circular DNA molecules to linear forms in the presence of
helper virus and that these genomes can be efficiently encapsidated and propagated to high
titers. Such viruses can be purified on a CsCl gradient and maintain their ability to transduce
cells in vitro, and their increased cloning capacity allows the inclusion of large genes and
tissue specific gene regulatory elements. The above results also show that the dystrophin gene
was expressed in cells transduced by such viruses and that the protein product was correctly
localized to the cell membrane. The above method for preparing EAMs theoretically enables
virtually any gene of interest to be inserted into an infectious minichromosome by
conventional cloning in plasmid vectors, followed by cotransfection with helper viral DNA in
293 cells. This approach is useful for a variety of gene transfer studies in vitro. The
observation that vectors completely lacking viral genes can be used to transfer a full-length
dystrophin cDNA into myogenic cells indicates that this method may be used for the
treatment of DMD using gene therapy.

d) A Modified MCK Enhancer Increases Expression Of Linked 
Genes In Muscle 

The DNA fragment containing the enhancer/promoter of the MCK gene utilized in the
Ad5βdys EAM plasmid is quite large (~3.3 kb). In order to provide a smaller DNA fragment
capable of directing high levels of expression of linked genes in muscle cells, portions of the
3.3 kb MCK enhancer/promoter were deleted and/or modified and inserted in front of a
reporter gene (lacZ). The enhancer element of the MCK gene was modified to produce the
2RS5 enhancer; the sequence of the 2RS5 enhancer is provided in SEQ ID NO:10. The first
6 residues of SEQ ID NO:10 represent a KpnI site added for ease of manipulation of the
modified MCK enhancer element. Residue number 7 of SEQ ID NO:10 corresponds to
residue number 2164 of the wild-type MCK enhancer sequence listed in SEQ ID NO:9
(position 2164 of SEQ ID NO:9 corresponds to position -1256 of the MCK gene). Residue
number 174 of SEQ ID NO:10 corresponds to residue number 2266 of the wild-type MCK
enhancer sequence listed in SEQ ID NO:9.
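
The correspondence between SEQ ID NO:9 residues and MCK gene positions can be expressed as a constant offset. The sketch below is illustrative only: it assumes SEQ ID NO:9 is collinear with the gene over this upstream interval, it uses the single anchor pair stated above (2164 and -1256), and the function name is hypothetical. It is not applied to SEQ ID NO:10, because the 2RS5 enhancer is a modified sequence and is not assumed to be collinear with the wild type.

```python
SEQ9_ANCHOR = 2164    # residue of SEQ ID NO:9 named in the text
GENE_ANCHOR = -1256   # corresponding MCK gene position named in the text

def seq9_to_gene_position(seq9_residue):
    """Map a SEQ ID NO:9 residue to an MCK gene position by a constant offset.
    Intended only for upstream (negative) positions, under the collinearity
    assumption stated in the lead-in."""
    return seq9_residue - SEQ9_ANCHOR + GENE_ANCHOR

print(seq9_to_gene_position(2164))  # -1256
print(seq9_to_gene_position(2266))  # -1154
```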

These MCK/lacZ constructs were used to transfect cells in culture (i.e., myogenic
cultures) or were injected as naked DNA into the muscle of mice and β-galactosidase activity
was measured. Figure 17 provides a schematic of the MCK/lacZ constructs tested. The first
construct shown in Figure 17 contains the 3.3 kb wild-type MCK enhancer/promoter
fragment linked to the E. coli lacZ gene. The wild-type enhancer element (-1256 to -1056)
is depicted by the box containing "E"; the core promoter element (-358 to -80) is indicated by
the light cross-hatching and the minimal promoter element (-80 to +7) is indicated by the dark
cross-hatching. The core promoter element is required in addition to the minimal promoter
element (which is required for basal expression) in order to achieve increased muscle-specific
expression in tissue culture [Shield et al. (1996) Mol. Cell. Biol. 16:5058]. The modified
enhancer element, the 2RS5 enhancer, is indicated by the box labeled "E*." The box labeled
"minx" contains a synthetic intron derived from adenovirus and is used to increase expression
of the constructs (any intron may be utilized for this purpose). The box labeled "CMV"
depicts the CMV IE enhancer/promoter, which was used as a positive control.

In Figure 17, β-galactosidase activity is expressed relative to the activity of the wild-type
MCK enhancer/promoter construct shown at the top of the figure. As shown in Figure
17, a construct containing the 2RS5 enhancer (SEQ ID NO:10) linked to either the minimal
MCK promoter (a ~261 bp element) or the core and minimal MCK promoter elements (a ~539
bp element) directs higher levels of expression of the reporter gene in muscle cells as
compared to the ~3.3 kb fragment containing the wild-type enhancer element. These modified
enhancer/promoter elements are considerably smaller than the ~3.3 kb fragment used in the
Ad5βdys EAM plasmid and are useful for directing the expression of foreign genes in muscle
cells. These smaller elements are particularly useful for driving the expression of genes in the
context of self-propagating adenoviral vectors, which have more severe constraints on the
amount of foreign DNA which can be inserted in comparison to the use of "gutted"
adenoviruses such as the EAMs described above.

EXAMPLE 7 

Generation Of High Titer Stocks Of Encapsidated Adenovirus 
Minichromosomes Containing Minimal Helper Virus Contamination 

The results presented in Example 6 demonstrated that encapsidated adenovirus
minichromosomes (EAMs) can be prepared that lack all viral genes and which can express
full-length dystrophin cDNAs in a muscle specific manner. The propagation of these EAMs
requires the presence of helper adenoviruses that contaminate the final EAM preparation with
conventional adenoviruses (about 4% of the total preparation). In this example the EAM
system is modified to enable the generation of high titer stocks of EAMs with minimal helper
virus contamination. Preferably the EAM stocks contain helper virus representing less than
1%, preferably less than 0.1% and most preferably less than 0.01% of the final viral isolate.
Purified EAMs are then injected in vivo in muscles of dystrophin minus mdx mice to
determine whether these vectors lead to immune rejection and whether they can alleviate
dystrophic symptoms in mdx muscles.

The amount of helper virus present in the EAM preparations is reduced in two ways. 
The first is by selectively controlling the relative packaging efficiency of the helper virus 
versus the EAM virus. The second is to improve physical methods for separating EAM from 
helper virus. These approaches enable the generation of dystrophin-expressing EAMs that are
contaminated with minimal levels of helper virus. 

a) Development And Characterization Of Adenovirus Packaging 
Cell Lines Expressing The Cre Recombinase From 
Bacteriophage P1

Cell lines expressing a range of Cre levels are used to optimize the amount of helper
virus packaging that occurs during growth of the EAM vectors. The Cre-loxP system [Sauer
and Henderson (1988) Proc. Natl. Acad. Sci. USA 85:5166] is employed to selectively disable
helper virus packaging during growth of EAMs. The bacteriophage P1 Cre recombinase catalyzes
efficient recombination between 34 bp target sequences called loxP sites. To delete a desired
sequence, loxP sites are placed at each end of the sequence to be deleted in the same
orientation; in the presence of Cre recombinase the intervening DNA segment is efficiently
excised.
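
The excision just described can be sketched as a string operation on a DNA sequence. This is a toy model under stated assumptions, not a description of the constructs used here: it considers only one strand, only two loxP sites in the same orientation, and treats recombination as complete. The function name and the flanking and intervening sequences in the example are hypothetical placeholders; the 34 bp loxP sequence is the one contained in the oligonucleotides of this example.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # 34 bp loxP site

def cre_excise(dna):
    """Return (product, excised) after recombination between the first two
    directly repeated loxP sites; input is returned unchanged if fewer
    than two sites are found."""
    first = dna.find(LOXP)
    if first == -1:
        return dna, ""
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna, ""
    # The excised segment carries one loxP copy; the product keeps the other.
    excised = dna[first + len(LOXP):second + len(LOXP)]
    product = dna[:first + len(LOXP)] + dna[second + len(LOXP):]
    return product, excised

# Hypothetical toy construct: placeholder flanks around a placeholder insert.
toy = "GGGG" + LOXP + "CCCCCCCC" + LOXP + "TTTT"
product, excised = cre_excise(toy)
print(product == "GGGG" + LOXP + "TTTT")  # True
```

A single loxP site remains in the product, which is why the total length of product plus excised segment equals the length of the starting molecule.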

Cell lines expressing a range of Cre recombinase levels are generated. The expression
of too little Cre protein may result in high levels of helper virus being generated, which leads
to unacceptably high levels of helper virus contaminating the final EAM preparation. If very
high levels of Cre expression are present in a cell line, excision of the packaging signal from
the helper virus would be 100% efficient (i.e., it would completely prevent helper virus
packaging). As shown in Example 6, serial passage of EAM preparations containing low
levels of helper virus increased the titer of the EAM. Therefore, it is desirable, at least in the
initial passages of a serial passage, that some helper virus capable of being packaged is
present. A low level of packageable helper virus may be provided by using a cell line
expressing levels of Cre recombinase which are not high enough to achieve excision of the
packaging signals from 100% of the helper virus; these cell lines would be used early in the
serial passaging of the EAM stock, and a cell line expressing high enough levels of Cre
recombinase to completely prevent helper virus packaging would be used for the final
passage.

Alternatively, EAMs may be prepared using a packaging cell line that supports high
efficiency Cre recombinase-mediated excision of the packaging signals from the helper virus
by transfection of the Cre-expressing cell line with the EAM plasmid followed by infection of
these cells with loxP-containing helper virus using an MOI of >1.0.

Human 293 cell lines that express a variety of levels of Cre recombinase are generated
as follows. An expression vector containing the Cre coding region. pOG231, was 
cotransfected into 293 cells along with pcDNA3 (Invitrogen): pcDNA3 contains the neo gene 
under the control of the SV40 promoter. pOG231 uses the human CMV enhancer/promoter 
[derived from CDM8 (Invitrogen)] to express a modified Cre gene [pOG231 was obtained 
from S. O'Gorman]. The modified Cre gene contained within pOG231 has had a nuclear 
localization signal inserted into the coding region to increase the efficiency of recombination. 

pOG231 was constructed as follows. A BglII site was introduced into the 5' XbaI site
of the synthetic intron of pMLSISCAT [Huang and Gorman (1990) Nucleic Acids Res.
18:937] by linker tailing. A BglII site in the synthetic intron of pMLSISCAT was destroyed
and a BamHI linker was inserted into the PstI site at the 3' end of the synthetic intron in
pMLSISCAT. BglII and SmaI sites and a nuclear localization signal were introduced into the
5' end of pMC-Cre [Gu et al. (1993) Cell 73:1155] using a PCR fragment that extended from
the novel BglII and SmaI sites to the BamHI site in the Cre coding region. This PCR
fragment was ligated to a BamHI/SalI fragment containing a portion of the Cre coding region
derived from pIC-Cre (Gu et al., supra) and the intron plus Cre coding sequence was inserted
into a modified form of pOG44 [O'Gorman et al. (1991) Science 251:1351] to generate
pOG231. The predicted sequence of pOG231 from the BglII site to the BamHI site located in
the middle of the Cre coding sequence is listed in SEQ ID NO:11.

One 60 mm dish of 293 cells (Microbix) was transfected with 10 µg of PvuII-linearized
pOG231 and 1 µg of A^o/I-linearized pcDNA3 using a standard calcium phosphate
precipitation protocol. Two days after the addition of DNA, the transfected cells were split
into three 100 mm dishes and 1000 µg/ml of active G418 was added to the medium. The
cells were fed periodically with G418-containing medium and three weeks later, 24
G418-resistant clones were isolated.

The isolated clones were expanded for testing. Aliquots were frozen in liquid nitrogen
at the earliest possible passage. The neomycin resistant cell lines were examined for the
expression of Cre recombinase using the following transfection assay. The neomycin resistant
cells were transfected with PGK-1-GFP-lacZ (obtained from Sally Camper, Univ. of
Michigan, Ann Arbor, MI), which contains a green fluorescent protein (GFP) expression
cassette that can be excised by Cre recombinase to allow expression of β-galactosidase;
transfection was accomplished using standard calcium phosphate precipitation. Figure 18
provides a schematic of a GFP/β-gal reporter construct suitable for assaying the expression of
Cre recombinase in mammalian cells; GFP sequences and β-gal (i.e., lacZ) sequences are
available from commercial sources (e.g., Clontech, Palo Alto, CA and Pharmacia Biotech,
Piscataway, NJ, respectively).

Control experiments verified that 293 cells transfected with PGK-1-GFP-lacZ
expressed significant amounts of β-galactosidase only if these cells also expressed Cre
recombinase. β-galactosidase assays were performed as described in Example 6. Neomycin
resistant cells expressing Cre recombinase were grouped as high, medium or low expressors
based upon the amount of β-galactosidase activity produced (estimated by direct counting of
β-galactosidase-positive cells per high-power field and by observing the level of staining and
the rapidity with which the blue stain was apparent) when these cell lines were transfected
with the GFP/β-gal reporter construct. Thirteen positive (i.e., Cre-expressing) lines, D608#12,
#7, #22, #18, #17, #4, #8, #2, #2/2, #13, #5, #15 and #21, were retained for further use.

The results of this transfection analysis revealed that cultures of 293 cells expressing 
medium to high levels of Cre recombinase could be generated without apparent toxicity. 

b) Generation Of Helper Adenovirus Strains That Contain loxP 
Sites Flanking The Adenovirus Packaging Signals 

Studies of EAM production demonstrated that the EAM vector has a packaging
advantage over the helper adenovirus (Ex. 6). While not limiting the present invention to a
particular mechanism, it is hypothesized that this packaging and replication advantage can be
greatly increased by using helper viruses that approach the packaging size limits of Ad5 [Bett
et al. (1993) J. Virol. 67:5911], by using viruses with mutations in E4 and/or E2 genes [Yeh
et al. (1996) J. Virol. 70:559; Gorziglia et al. (1996) J. Virol. 70:4173; and Amalfitano et al.
(1996) Proc. Natl. Acad. Sci. USA 93:3352], by inclusion of mutations or alterations in the
packaging signals of the helper virus [Imler et al. (1995) Hum. Gene Ther. 6:711] and by
combining these strategies.

The Cre-loxP excision method is used to disable the packaging signals from the helper
virus genomes. The Ad5 packaging domain extends from nucleotide 194 to 358 and is
composed of five distinct elements that are functionally redundant [Hearing et al. (1987) J.
Virol. 61:2555]. Theoretically, any molecule containing the Ad5 origin of replication and
packaging elements should replicate and be packaged into mature virions in the presence of
non-defective helper virus. Disabling the packaging signals should allow replication and gene
expression to proceed, but will prevent packaging of viral DNA into infectious particles. This
in turn should allow the ratio of EAM to helper virus to be increased greatly.

To disable the packaging signals within the helper virus used to encapsidate the EAMs,
the loxP sequences are incorporated into a helper virus that has a genome approaching the
maximal packaging size for Ad [~105 map units; Bett et al. (1993), supra], which further
decreases the efficiency of helper virus packaging. Final virus size can be adjusted by the
choice of introns inserted into the reporter gene, by the choice of reporter genes, or by
including a variety of DNA fragments of various sizes to act as "stuffer" fragments. A
convenient reporter gene is the alkaline phosphatase gene (see Ex. 6). The optimized
Cre-loxP system is also incorporated into a helper viral backbone containing disruptions of any or
all of the E1-E4 genes. Use of such deleted genomes requires viral growth on appropriate
complementing cell lines, such as 293 cells expressing E2 and/or E4 gene products. The
loxP sequences are incorporated into the helper virus by placing the loxP
sequences on either side of the packaging signals on a shuttle vector. This modified shuttle
vector is then used to recombine with the Ad DNA derived from a virus containing
disruptions of any or all of the E1-E4 genes to produce the desired helper virus containing the
packaging signals flanked by loxP sequences.
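
The sizes and positions quoted in map units can be converted to base pairs with simple arithmetic. The sketch below takes one map unit as 360 bp, the value this example gives for the Ad5 genome; the function names are hypothetical.

```python
BP_PER_MAP_UNIT = 360  # Ad5 value used in this example

def mu_to_bp(map_units):
    """Convert adenovirus map units to base pairs."""
    return map_units * BP_PER_MAP_UNIT

def bp_to_mu(base_pairs):
    """Convert base pairs to adenovirus map units."""
    return base_pairs / BP_PER_MAP_UNIT

# Approximate maximal packaging size of ~105 map units, in base pairs.
print(mu_to_bp(105))  # 37800

# Packaging domain (nucleotides 194 to 358) expressed in map units.
print(round(bp_to_mu(194), 2), round(bp_to_mu(358), 2))  # 0.54 0.99
```

The second result agrees with the placement of the packaging signals between roughly 0.5 and 1 map unit.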

pAdBglII, an adenovirus shuttle plasmid containing Ad sequences from 0-1 and 9-16
map units (mu), was used as the starting material. Synthetic oligonucleotides were used to
create a polylinker which was inserted at 1 mu within pAdBglII as follows. The BglII LoxP
oligo
[5'-GAAGATCTATAACTTCGTATAATGTATGCTATACGAAGTTATTACCGAAGAAATGGCTCGAGATCTTC-3'
(SEQ ID NO:12)] and its reverse complement
[5'-GAAGATCTCGAGCCATTTCTTCGGTAATAACTTCGTATAGCATACATTATACGAAGTTATAGATCTTC-3'
(SEQ ID NO:13)] were synthesized. The AflIII LoxP oligo
[5'-CCACATGTATAACTTCGTATAGCATACATTATACGAAGTTATACATGTGG-3' (SEQ ID NO:14)]
and its reverse complement
[5'-CCACATGTATAACTTCGTATAATGTATGCTATACGAAGTTATACATGTGG-3' (SEQ ID NO:15)]
were synthesized. The double stranded form of each loxP oligonucleotide was
digested with the appropriate restriction enzyme (e.g., BglII or AflIII) and inserted into
pAdBglII which had been digested with BglII and AflIII. This resulted in the insertion of
LoxP sequences into the shuttle vector flanking the packaging signals that are located between
0.5 and 1 mu of Ad5 (one mu equals 360 bp in the Ad5 genome; the sequence of the Ad5
genome is listed in SEQ ID NO:1). The 3' loxP sequence was inserted into the BglII site
within pAdBglII. The 5' loxP sequence was inserted into the AflIII site located at base 143
(~0.8 mu) (numbering relative to Ad5).
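
The relationships among the four oligonucleotides can be checked programmatically. The sketch below is illustrative: the sequences are transcribed from SEQ ID NOS:12-15 as given above, the helper function is hypothetical, and the recognition sequences used (AGATCT for BglII, CTCGAG for XhoI, ACATGT as one form of the AflIII site) are standard values supplied here rather than taken from this text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # 34 bp loxP site

SEQ12 = "GAAGATCTATAACTTCGTATAATGTATGCTATACGAAGTTATTACCGAAGAAATGGCTCGAGATCTTC"
SEQ13 = "GAAGATCTCGAGCCATTTCTTCGGTAATAACTTCGTATAGCATACATTATACGAAGTTATAGATCTTC"
SEQ14 = "CCACATGTATAACTTCGTATAGCATACATTATACGAAGTTATACATGTGG"
SEQ15 = "CCACATGTATAACTTCGTATAATGTATGCTATACGAAGTTATACATGTGG"

# Each oligo pair should anneal as exact reverse complements.
print(reverse_complement(SEQ12) == SEQ13)  # True
print(reverse_complement(SEQ14) == SEQ15)  # True

# One strand of each duplex carries loxP in the orientation written above.
print(LOXP in SEQ13, LOXP in SEQ14)        # True True

# Cloning sites flank the loxP core; the BglII oligo also carries an XhoI site.
print(SEQ12.count("AGATCT"), "CTCGAG" in SEQ12, SEQ14.count("ACATGT"))
```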

DNA sequencing was used to verify the final structure of the modified shuttle plasmid
between 0-1 mu. If Cre recombinase-mediated excision of the packaging signals is found to
be too efficient (as judged by the production of too little helper virus), alternate sites for the
insertion of the loxP sequences are used that would result in deletion of 2, 3 or 4 packaging
sequences rather than all 5 [Grable and Hearing (1990) J. Virol. 64:2047]. The insertion of
loxP sequences at sites along the Ad genome contained within the shuttle vector which would
result in the deletion of 2, 3 or 4 packaging sequences is easily made using the technique of
recombinant PCR [Higuchi (1990) In: PCR Protocols: A Guide to Methods and
Applications, Innis et al. (eds.) Academic Press, San Diego, CA, pp. 177-183]. The optimal
amount of Cre recombinase-mediated excision of the packaging signals is that amount which
permits the production of enough packaged helper virus to permit the slow spread of virus on
the first plate of cells co-transfected with helper virus DNA (containing the loxP sequences)
and the EAM vector. This permits serial passage of the EAM preparation onto a subsequent
lawn of cells to increase the titer of the EAM preparation. Alternatively, if Cre
recombinase-mediated excision of the packaging signals is essentially 100% efficient in all Cre-expressing
cell lines (i.e., regardless of the level of Cre expression, even in a cell line expressing a
low level of Cre as judged by the GFP/β-gal assay described above), the packaged EAMs
may be used along with helper virus (used at an MOI of ~1.0) to infect the second or
subsequent lawn of cells to permit serial passaging to increase the titer of the EAM
preparation.

Following introduction of the loxP sequences into the shuttle vector, the human
placental alkaline phosphatase (HpAp) cDNA under control of the RSV promoter was inserted
into the polylinker to provide a reporter gene for the helper virus. This is the same reporter
used previously during EAM generation (Ex. 6). The HpAp sequences were inserted as
follows. The loxP-containing shuttle vector was linearized with XhoI and the HpAp cassette
was ligated into the XhoI site (an XhoI site was inserted into pAdBglII during the insertion of
the loxP sequences, as an XhoI site was located on the 3' end of the loxP sequences inserted
into the BglII site of pAdBglII). The HpAp cassette was constructed as follows.
pRSVhAPT40 (obtained from Gary Nabel, Univ. of Michigan, Ann Arbor, MI) was digested
with EcoRI to generate an EcoRI fragment containing the HpAP cDNA and the SV40 intron
and polyadenylation sequences. pRc/RSV (Invitrogen) was digested with HindIII, then
partially digested with EcoRI. A 5,208 bp fragment was then size selected on an agarose gel,
treated with calf alkaline phosphatase, and ligated to the EcoRI fragment derived from
pRSVhAPT40 to generate pRc/RSVAP. pRc/RSVAP was then digested with SalI and XhoI to
liberate the RSV promoter linked to the HpAP cDNA cassette (including the SV40 intron and
polyadenylation sequences). This SalI-XhoI fragment was inserted into the loxP-containing
shuttle vector which had been digested with SalI and XhoI to generate pADLoxP-RSVAP.

Helper virus containing loxP sites flanking the Ad packaging signals is generated by
co-transfection of the LoxP shuttle plasmid (pADLoxP-RSVAP) and ClaI-digested
Ad5dl7001 DNA into 293 cells [Graham and Prevec (1991) Manipulation of Adenovirus
Vectors. In Methods in Molecular Biology, Vol. 7, Gene Transfer and Expression Protocols,
Murray (ed.), Humana Press Inc., Clifton, NJ, pp. 109-128]. Figure 19 provides a schematic
showing the recombination event between the loxP shuttle vector and the Ad5dl7001 genome.
Co-transfection is carried out as described in Example 3c.

Alternatively, the reporter gene may be inserted into the E3 region of pBHG10 or
pBHG11 (using the unique PacI site) rather than into the polylinker located in the E1 region
of the shuttle vector. The reporter gene-containing pBHG10 or pBHG11 is then used in place of
Ad5dl7001 for cotransfection of 293 cells along with the loxP-containing pAdBglII.

Following cotransfection, recombinant plaques are picked, plaque purified, and tested
for incorporation of both HpAP and loxP sequences by PCR and Southern analysis (Ex. 3).
Viruses which contain loxP sites flanking the packaging signals and the marker gene (HpAP)
are retained, propagated and purified.

The isolated helper virus containing loxP sites flanking the packaging signals and the
marker gene is then used to infect both 293 cells and the 293 cell lines that express Cre
(section a, above). Cre recombinase-expressing cell lines that produce optimal levels of Cre
recombinase when infected with the loxP-containing helper virus are then used for the
generation of EAMs as described below.
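
The effect of Cre recombinase on a helper genome whose packaging signals are flanked by directly repeated loxP sites can be illustrated with a toy string model. This is only a sketch under stated assumptions: the 34 bp sequence below is the commonly cited loxP site, while the flanking and "packaging signal" strings are hypothetical placeholders rather than Ad5 sequence, and the function name is invented for illustration.

```python
# Toy model of Cre-mediated excision between two directly repeated loxP sites.
# LOXP is the commonly cited 34 bp site; all other sequences are placeholders.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(genome: str, site: str = LOXP) -> str:
    """Delete the segment between the first two direct-repeat sites.

    One copy of the site remains in the product. If fewer than two
    sites are present, the genome is returned unchanged.
    """
    first = genome.find(site)
    if first == -1:
        return genome
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome
    # Keep everything up to and including the first site, then resume
    # immediately after the second site.
    return genome[: first + len(site)] + genome[second + len(site):]


# Hypothetical helper genome: left end, loxP, packaging signal, loxP, remainder.
LEFT_END = "GGGG"
PACKAGING_SIGNAL = "TTTTTT"   # placeholder, not the Ad5 packaging signal
REMAINDER = "CCCC"
helper = LEFT_END + LOXP + PACKAGING_SIGNAL + LOXP + REMAINDER
excised = cre_excise(helper)
```

In this model the excised product retains a single loxP site and no packaging signal, which mirrors why a helper genome processed in Cre-expressing cells is expected to be packaged poorly.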
c) Generation Of Encapsidated Adenovirus Minichromosomes
That Express Dystrophin

The growth of EAMs is optimized using the following methods. In the first method,
plasmid DNA from the EAM vector (Ex. 6) is co-transfected into 293 cells with purified
viral DNA from the helper virus, and viruses are harvested 10-14 days later, after appearance
of a viral cytopathic effect (CPE), as described in Example 6. This approach is simple, yet
potentially increases helper virus levels by allowing viral spread throughout the culture dishes.

In the second method, the co-transfected 293 cells are overlaid with agar and single
plaques are picked and tested for EAM activity by the ability to express β-galactosidase
activity following infection of HeLa cells (Ex. 6). This second approach is more time
consuming, but will result in less contamination by helper virus.

The ability of these two methods to generate EAMs is directly compared. The
preferred method is that which produces the highest titer of EAM with the lowest
contamination of helper virus. The efficiency of EAM generation may also be increased by
using helper virus DNA that retains the terminal protein (TP) (i.e., the helper virus is used as
a viral DNA-TPC as described in Ex. 3). The use of viral DNA-TPC has been shown to
increase the efficiency of viral production following transfection of 293 cells by an order of
magnitude [Miyake et al. (1996), supra].

The initial transfection will utilize 293 cells that do not express Cre recombinase, so
that efficient spread of the helper can lead to large scale production of EAM. The method
producing the highest ratio of EAM to helper virus is then employed to optimize conditions
for the serial propagation of the EAM on 293 cells expressing Cre recombinase. This
optimization is conducted using cell lines expressing different levels of Cre recombinase. The
following variables are tested: 1) the ratio of input viral titer to cell number, 2) the number
of serial passages to use for EAM generation, 3) the use of cell lines producing different
levels of Cre to achieve the optimal ratio between high EAM titer and low helper titer, 4)
continuous growth on Cre-producing 293 cells versus alternating between Cre-producing cells
and the parental 293 cells, or alternating between high and low Cre-producing cells, and 5)
whether CsCl purification of EAMs prior to re-infection of 293 cells increases the ratio of the
final EAM/helper titers. The protocols that result in the highest yield of EAM with minimal helper
virus are used to generate large volumes of crude viral lysates for purification by density
gradient centrifugation (Ex. 6).

EAMs were purified from helper virus using standard CsCl density gradient
centrifugation protocols in Example 6, which resulted in preparations containing ~4% helper virus.
In order to improve the physical separation of EAMs and helper virus, a variety of different
centrifugation conditions are possible, including changing the gradient shape, type of rotor and
tube used, combinations of step and continuous gradients, and the number of gradients used.
Materials with better resolving powers than CsCl can also be employed. These include
rubidium chloride and potassium bromide [Reich and Zarybnicky (1979) Anal. Biochem.
94:193].
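
As a bookkeeping aid for comparing protocols, helper contamination can be expressed as a fraction of total virus and candidate protocols can be ranked by it. The sketch below is illustrative only: it assumes contamination is defined as helper titer divided by the sum of EAM and helper titers, and the function names and the tie-breaking rule (higher EAM titer wins) are assumptions, not part of the described method.

```python
def helper_fraction(eam_titer: float, helper_titer: float) -> float:
    """Return helper virus as a fraction of total (EAM + helper) titer.

    Assumes both titers are expressed in the same units.
    """
    total = eam_titer + helper_titer
    if eam_titer < 0 or helper_titer < 0 or total == 0:
        raise ValueError("titers must be non-negative and not both zero")
    return helper_titer / total


def rank_protocols(titers: dict) -> list:
    """Order protocol names from lowest to highest helper fraction.

    `titers` maps a protocol name to an (eam_titer, helper_titer) pair.
    Ties are broken in favor of the higher EAM titer (an assumption).
    """
    return sorted(
        titers,
        key=lambda name: (helper_fraction(*titers[name]), -titers[name][0]),
    )
```

For example, `helper_fraction(9.6e9, 4e8)` corresponds to a preparation in which helper virus makes up 4% of the total.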



d) Use Of EAMs Encoding Dystrophin For Long Term
Expression Of Dystrophin In Muscle

To demonstrate the ability of the purified dystrophin-expressing EAMs (prepared as
described above) to deliver and express dystrophin in the muscle of an animal, the following
experiment is conducted. First, purified dystrophin-EAMs are delivered to the muscle by
direct intramuscular injection into newborn, 1 month, and adult (3 month) mouse quadriceps
muscle. The dystrophin-expressing EAM Ad5βDys is injected into mdx mice and into
transgenic mdx mice that express β-galactosidase in the pituitary gland [Tripathy et al. (1996)
Nature Med. 2:545]. These latter mice are used to avoid potential immune rejection of cells
expressing β-galactosidase from the Ad5βDys vector. An alternate EAM lacking the
β-galactosidase reporter gene may also be employed; however, the presence of the β-Gal
reporter simplifies EAM growth and purification (vectors lacking β-Gal have their purity
estimated by PCR assays rather than by β-Gal assays).

Following intramuscular injection of EAM, animals are sacrificed at intervals between
1 week and 6 months to measure dystrophin expression [by western blot analysis and by
immunofluorescence; Phelps et al. (1995) Hum. Mol. Genet. 4:1251 and Rafael et al. (1996)
J. Cell Biol. 134:93], and muscle extracts are also assayed for β-Gal activity [MacGregor
et al. (1991), supra]. These results are compared with previous results obtained using current
generation viral vectors (i.e., containing deletions in E1 and E3 only) to demonstrate that
EAMs improve the prospects for long term gene expression in muscle.

e) Use Of Dystrophin-EAMs To Prevent, Halt, Or Reverse The
Dystrophic Symptoms That Develop In The Muscles Of mdx
Mice

To demonstrate that beneficial effects on dystrophic muscle are achieved by delivery
of dystrophin-expressing EAMs to mdx mice, the following experiments are conducted.
Central nuclei counts are performed on soleus muscle at intervals following injection of EAM
into newborn, 1 month, and 3 month (adult) mice (Phelps et al., 1995). Central nuclei arise
in mouse muscle only after a myofiber has undergone dystrophic necrosis followed by
regeneration, and their frequency is a quantitative measure of the degree of dystrophy that has
occurred in a muscle group (Phelps et al., 1995).
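
A central nuclei count is usually reported as the percentage of scored myofibers that contain a centrally located nucleus. The helper below is a minimal sketch of that calculation; the function name and the input counts are hypothetical, and the scoring criteria themselves are those of the cited reference rather than anything defined here.

```python
def percent_central_nuclei(fibers_with_central_nuclei: int, fibers_scored: int) -> float:
    """Return the percentage of scored myofibers containing central nuclei."""
    if fibers_scored <= 0:
        raise ValueError("at least one fiber must be scored")
    if not 0 <= fibers_with_central_nuclei <= fibers_scored:
        raise ValueError("count of central-nucleated fibers is out of range")
    return 100.0 * fibers_with_central_nuclei / fibers_scored
```

Comparing this percentage between EAM-injected and mock-injected muscle at each interval gives a simple numerical readout of the degree of dystrophy.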

More informative assays are also contemplated. These assays require administration of 
EAMs to mouse diaphragm muscles. 

The diaphragm is severely affected in mdx mice (Stedman et al., 1991; Cox et al.,
1993), and displays dramatic decreases in both force and power generation. Administration of
virus to the diaphragm will allow the strength of the muscles to be measured at intervals
following dystrophin delivery. The force and power generating assays developed by the
Faulkner lab are used to measure the effect of dystrophin transgenes (Shrager et al., 1992;
Rafael et al., 1994; Lynch et al., 1996).

To determine whether dystrophin delivery to dystrophic muscle reverses dystrophy or
stabilizes muscle, varying amounts of the dystrophin EAM are delivered to the diaphragm
at different stages of the dystrophic process and then strength is measured at intervals
following EAM administration. First, various titers of EAM are tested to determine the
minimal amount of virus needed to transduce the majority of muscle fibers in the diaphragm.
It has been shown that conventional adenovirus vectors can transduce the majority of
diaphragm fibers when 10^8 pfu are administered by direct injection into the intraperitoneal
cavity [Huard et al. (1995) Gene Therapy 2:107]. In addition, it has been shown that
transduction of a simple majority of fibers in a muscle group is sufficient to prevent virtually
all the dystrophic symptoms in mice [Rafael et al. (1994) Hum. Mol. Genet. 3:1725 and
Phelps et al. (1995) Hum. Mol. Genet. 4:1251]. Virus is administered to mdx animals at three
different ages (neonatal, 1 month, and 3 months). Animals are sacrificed for physiological
analysis of diaphragm muscle at two different times post infection (1 month and 3 months).
Error control is achieved by performing these experiments in sextuplicate. Control animals
consist of mock injected wild-type (C57BL/10) and dystrophic (mdx) mice. Three month old
mdx mice display a 40% reduction in force and power generation compared with wild-type
mice, while 6 month animals display a greater than 50% reduction [Cox et al. (1993) Nature
364:725; Rafael et al. (1994), supra; Phelps et al. (1995), supra; and Corrado et al. (1996) J.
Cell. Biol. 134:873].
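
The injection schedule described above (three ages at injection, two analysis times, six replicates) can be enumerated to check group sizes before animals are allocated. The following is a planning sketch only: the function and field names are hypothetical, and it counts a single treatment arm, since the text does not state how many control animals are used per condition.

```python
from itertools import product

AGES_AT_INJECTION = ["neonatal", "1 month", "3 months"]
TIMES_POST_INFECTION = ["1 month", "3 months"]
REPLICATES = 6  # "sextuplicate"


def design_matrix(ages=AGES_AT_INJECTION, times=TIMES_POST_INFECTION,
                  replicates=REPLICATES):
    """List one record per animal in a single treatment arm."""
    return [
        {"age_at_injection": age, "time_post_infection": time, "replicate": rep}
        for age, time in product(ages, times)
        for rep in range(1, replicates + 1)
    ]


def percent_reduction(wild_type_value: float, test_value: float) -> float:
    """Percent reduction of a force or power measurement relative to wild type."""
    if wild_type_value <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (wild_type_value - test_value) / wild_type_value


animals = design_matrix()
```

With the default settings the matrix holds 3 x 2 x 6 = 36 animals per arm; `percent_reduction` expresses a measurement in the same terms as the 40% and greater than 50% figures quoted for mdx mice.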



EXAMPLE 8 

Improved Shuttle Vectors For The 
Production Of Helper Virus Containing LoxP Sites 



In the previous example, a shuttle vector containing adenoviral sequences extending 
from 9 mu to 16 mu of the Ad5 genome was modified to contain loxP sequences surrounding 
the packaging signals (the LoxP shuttle vector). This modified shuttle vector was then 
recombined with an Ad virus to produce a helper virus containing the loxP sequences. This 
helper virus was then used to infect cells expressing Cre recombinase along with DNA 
comprising a minichromosome containing the dystrophin gene, a reporter gene and the 
packaging signals and ITRs of Ad in order to preferentially package the minichromosomes. 
Using this approach the helper virus, which has had the packaging signals removed by Cre- 
loxP recombination, contains the majority of the Ad genome (only a portion of the E3 region 
is deleted). Thus, if low levels of helper virus are packaged and appear in the EAM 
preparation, the EAM preparation has the potential of passing on helper virus capable of 
directing the expression of Ad proteins in cells which are exposed to the EAM preparation. 
The expression of Ad proteins may lead to an immune response directed against the infected 
cells. 

Another approach to reducing the possibility that the EAM preparation contains helper
virus capable of provoking an immune response is to use helper viruses containing deletions
and/or mutations within the pol and pTP genes. Helper virus containing a deletion in the pol
and/or pTP genes is cotransfected with the EAM construct into 293-derived cell lines
expressing pol or pol and pTP to produce EAMs. Any helper virus present in the purified
EAM preparation will be replication defective due to the deletion in the pol and/or pTP genes.
As shown in Example 3, viruses containing a deletion in the pol gene are incapable of
directing the expression of viral late genes; therefore, helper viruses containing a deletion in
the pol gene or the pol and pTP genes should not be capable of provoking an immune
response (i.e., a CTL response) against late viral proteins synthesized de novo. Shuttle
vectors containing deletions within the Ad pol and/or pTP genes are constructed as described
below.

a) Construction Of A Shuttle Vector Containing The Δpol
Deletion

pAdBglII was modified to contain sequences corresponding to 9 to 40 mu of the Ad5
genome as follows. pAdBglII was digested with BglII and a linker/adapter containing an AscI
site was added to create pAdBglIIAsc. pAdBglIIAsc was then digested with Bst1107I.
Ad5dl7001 viral DNA was digested with AscI and the ends were filled in using T4 DNA
polymerase. The AscI-digested, T4 DNA polymerase filled Ad5dl7001 viral DNA was then
digested with Bst1107I and the ~9.9 kb AscI-Bst1107I(filled) fragment containing the pol and
pTP genes was isolated (as described in Ex. 3) and ligated to the Bst1107I-digested
pAdBglIIAsc to generate pAdAsc. pAdAsc is a shuttle vector which contains the genes
encoding the DNA polymerase and preterminal protein (the inserted AscI-Bst1107I fragment
corresponds to nucleotides 5767 to 15,671 of the Ad5 genome).

A shuttle vector, pAdAscΔpol, which contains the 612 bp deletion in the pol gene
(described in Ex. 3) was constructed as follows. Ad5ΔpolΔE3I viral DNA (Ex. 3) was
digested with AscI and the ends were filled in using T4 DNA polymerase. The AscI-digested,
T4 DNA polymerase filled Ad5ΔpolΔE3I viral DNA was then digested with Bst1107I and the
~9.3 kb AscI-Bst1107I(filled) fragment containing the deleted pol gene and the pTP gene was
isolated (as described in Ex. 3) and ligated to Bst1107I-digested pAdBglIIAsc to generate
pAdAscΔpol.

b) Construction Of A Shuttle Vector Containing The ΔpolΔpTP
Deletion

A shuttle vector, pAdAscΔpolΔpTP, which contains a 2.3 kb deletion within the pol
and pTP genes (described in Ex. 5) was constructed as follows.
pAXBΔpolΔpTPVARNA+t13 (Ex. 5b) was digested with AscI and the ends were filled in
using T4 DNA polymerase. The AscI-digested, T4 DNA polymerase filled
pAXBΔpolΔpTPVARNA+t13 DNA was then digested with Bst1107I and the ~7.6 kb
AscI-Bst1107I(filled) fragment containing the deleted pol gene and the pTP gene was isolated (as
described in Ex. 3) and ligated to Bst1107I-digested pAdBglIIAsc to generate
pAdAscΔpolΔpTP.
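
The fragment sizes quoted in sections a) and b) follow from simple coordinate arithmetic, which can be checked as below. This is a sketch under two assumptions: the stated Ad5 coordinates (nucleotides 5767 to 15,671) are treated as inclusive, and the "2.3 kb" deletion is taken as exactly 2,300 bp for illustration. The function names are hypothetical.

```python
def fragment_length(start_nt: int, end_nt: int) -> int:
    """Length in bp of a fragment spanning start_nt..end_nt (inclusive)."""
    if end_nt < start_nt:
        raise ValueError("end coordinate must not precede start coordinate")
    return end_nt - start_nt + 1


def to_kb(length_bp: int) -> float:
    """Express a length in kb, rounded to one decimal place."""
    return round(length_bp / 1000, 1)


# AscI-Bst1107I fragment of the parental genome (coordinates from the text).
PARENTAL_BP = fragment_length(5767, 15671)

# Same fragment carrying the 612 bp pol deletion.
DELTA_POL_BP = PARENTAL_BP - 612

# Same fragment carrying the pol/pTP deletion (2.3 kb taken as 2,300 bp).
DELTA_POL_PTP_BP = PARENTAL_BP - 2300
```

Under these assumptions the three fragments come to roughly 9.9 kb, 9.3 kb and 7.6 kb, in agreement with the sizes given for the isolated fragments.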

In order to reduce the packaging of the above helper viruses, the pol- or pol-, pTP-
helper viruses can be modified to incorporate loxP sequences on either side of the packaging
signals as outlined in Example 7. The loxP-containing pol- or pol-, pTP- shuttle vectors
(pTP- shuttle vectors may also be employed) are cotransfected into 293 cells expressing Cre
recombinase and pol or Cre recombinase, pol and pTP, respectively, along with an appropriate
E1- viral DNA-TPC (the E1- viral DNA may also contain deletions elsewhere in the genome,
such as in the E4 genes or in the E2a gene, as packaging cell lines expressing the E4 ORF 6
or 6/7 and lines expressing E2a genes are available) to generate helper virus containing loxP
sites flanking the packaging signals as well as a deletion in the pol gene or the pol and pTP
genes. The resulting helper virus(es) is used to cotransfect 293 cells expressing pol or pol
and pTP along with the desired EAM construct. The resulting EAM preparation should
contain little if any helper virus, and any contaminating helper virus present would be
replication defective and incapable of expressing viral late gene products. Helper viruses
containing loxP sequences and deletions in all essential early genes may be employed in
conjunction with Cre recombinase-expressing cell lines expressing in trans the E1, E4 ORF
6, E2a, and E2b (e.g., Ad polymerase and pTP) proteins (the E3 proteins are dispensable for
growth in culture). Cell lines coexpressing E1, polymerase and pTP are provided herein.
Cell lines expressing E1 and E4 proteins have been recently described [Krougliak and Graham
(1995) Hum. Gene Ther. 6:1575 and Wang et al. (1995) Gene Ther. 2:775] and cell lines
expressing E1 and E2a proteins have been recently described [Zhou et al. (1996) J. Virol.
70:7030]. Therefore, a cell line co-expressing E1, E2a, E2b, and E4 is constructed by
introduction of expression plasmids containing the E2a and E4 coding regions into the E1-,
Ad polymerase- and pTP-expressing cell lines of the present invention. These packaging cell
lines are used in conjunction with helper viruses containing deletions in the E1, E2a, E2b and
E4 regions.

All publications and patents mentioned in the above specification are herein
incorporated by reference. Various modifications and variations of the described method and
system of the invention will be apparent to those skilled in the art without departing from the
scope and spirit of the invention. Although the invention has been described in connection
with specific preferred embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed, various modifications of
the described modes for carrying out the invention which are obvious to those skilled in
molecular biology or related fields are intended to be within the scope of the following
claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Chamberlain, Jeffrey S.
Amalfitano, Andrea
Hauser, Michael A.
Kumar-Singh, Rajendra
Hartigan-O'Connor, Dennis J.

(ii) TITLE OF INVENTION: IMPROVED ADENOVIRUS VECTORS

(iii) NUMBER OF SEQUENCES: 15

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Medlen & Carroll, LLP
(B) STREET: 220 Montgomery Street, Suite 2200
(C) CITY: San Francisco
(D) STATE: California
(E) COUNTRY: United States Of America
(F) ZIP: 94104

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Ingolia, Diane E.
(B) REGISTRATION NUMBER: 40,027
(C) REFERENCE/DOCKET NUMBER: UM-02484

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (415) 705-8410
(B) TELEFAX: (415) 397-8338

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35935 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT 60

TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT 120

GATGTTGCAA GTGTGGCGGA ACACATGTAA GCGACGGATG TGGCAAAAGT GACGTTTTTG 180

GTGTGCGCCG GTGTACACAG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 240

TAAATTTGGG CGTAACCGAG TAAGATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 300

AGTGAAATCT GAATAATTTT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG 360

GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 420

CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG TGTAGTGTAT TTATACCCGG 480

TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 540

TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 600

AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 660

TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 720

CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GACTCTGTAA TGTTGGCGGT 780

GCAGGAAGGG ATTGACTTAC TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 840

CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 900

CCTTGTACCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 960

CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 1020

CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 1080

CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAATTATGGG CAGTGGGTGA 1140

TAGAGTGGTG GGTTTGGTGT GGTAATTTTT TTTTTAATTT TTACAGTTTT GTGGTTTAAA 1200

GAATTTTGTA TTGTGATTTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 1260

CCAGAACCGG AGCCTGCAAG ACCTACCCGC CGTCCTAAAA TGGCGCCTGC TATCCTGAGA 1320

CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 1380

CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 1440

GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 1500

CCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 1560

TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 1620

GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 1680

CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 1740

TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 1800

TTTCTGTGGG GCTCATCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 1860

GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 1920

CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 1980

GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 2040

AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT TGTGAGACAC 2100

AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCGA TAATACCGAC GGAGGAGCAG 2160

CAGCAGCAGC AGGAGGAAGC CAGGCGGCGG CGGCAGGAGC AGAGCCCATG GAACCCGAGA 2220

GCCGGCCTGG ACCCTCGGGA ATGAATGTTG TACAGGTGGC TGAACTGTAT CCAGAACTGA 2280

GACGCATTTT GACAATTACA GAGGATGGGC AGGGGCTAAA GGGGGTAAAG AGGGAGCGGG 2340

GGGCTTGTGA GGCTACAGAG GAGGCTAGGA ATCTAGCTTT TAGCTTAATG ACCAGACACC 2400

GTCCTGAGTG TATTACTTTT CAACAGATCA AGGATAATTG CGCTAATGAG CTTGATCTGC 2460

TGGCGCAGAA GTATTCCATA GAGCAGCTGA CCACTTACTG GCTGCAGCCA GGGGATGATT 2520

TTGAGGAGGC TATTAGGGTA TATGCAAAGG TGGCACTTAG GCCAGATTGC AAGTACAAGA 2580

TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG 2640

AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG 2700

TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG 2760

GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA 2820

ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT 2880

GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG 2940

AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT 3000

CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT 3060

GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC 3120

TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA 3180

ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC 3240

AATGCAATTT GAGTCACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC 3300

TGAACGGGGT GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC 3360

GCACCAGGTG CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG CCTGTGATGC 3420

TGGATGTGAC CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGCGCTGAGT 3480

TTGGCTCTAG CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGCTTAAGGG 3540

TGGGAAAGAA TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGCAGCCG 3600

CCGCCGCCAT GAGCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT TTGACAACGC 3660

GCATGCCCCC ATGGGCCGGG GTGCGTCAGA ATGTGATGGG CTCCAGCATT GATGGTCGCC 3720

CCGTCCTGCC CGCAAACTCT ACTACCTTGA CCTACGAGAC CGTGTCTGGA ACGCCGTTGG 3780

AGACTGCAGC CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG ATTGTGACTG 3840

ACTTTGCTTT CCTGAGCCCG CTTGCAAGCA GTGCAGCTTC CCGTTCATCC GCCCGCGATG 3900

ACAAGTTGAC GGCTCTTTTG GCACAATTGG ATTCTTTGAC CCGGGAACTT AATGTCGTTT 3960

CTCAGCAGCT GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTTCC TCCCCTCCCA 4020

ATGCGGTTTA AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTTGGATC AAGCAAGTGT 4080

CTTGCTGTCT TTATTTAGGG GTTTTGCGCG CGCGGTAGGC CCGGGACCAG CGGTCTCGGT 4140

CGTTGAGGGT CCTGTGTATT TTTTCCAGGA CGTGGTAAAG GTGACTCTGG ATGTTCAGAT 4200

ACATGGGCAT AAGCCCGTCT CTGGGGTGGA GGTAGCACCA CTGCAGAGCT TCATGCTGCG 4260

GGGTGGTGTT GTAGATGATC CAGTCGTAGC AGGAGCGCTG GGCGTGGTGC CTAAAAATGT 4320

CTTTCAGTAG CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGTTT ACAAAGCGGT 4380

TAAGCTGGGA TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTTAGGT 4440

TGGCTATGTT CCCAGCCATA TCCCTCCGGG GATTCATGTT GTGCAGAACC ACCAGCACAG 4500

TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAAATGCG TGGAAGAACT 4560

TGGAGACGCC CTTGTGACCT CCAAGATTTT CCATGCATTC GTCCATAATG ATGGCAATGG 4620

GCCCACGGGC GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAGTTGTGTT 4680




CCAGGATGAG ATCGTCATAG GCCATTTTTA CAAAGCGCGG GCGGAGGGTG CCAGACTGCG 4740

GTATAATGGT TCCATCCGGC CCAGGGGCGT AGTTACCCTC ACAGATTTGC ATTTCCCACG 4800

CTTTGAGTTC AGATGGGGGG ATCATGTCTA CCTGCGGGGC GATGAAGAAA ACGGTTTCCG 4860

GGGTAGGGGA GATCAGCTGG GAAGAAAGCA GGTTCCTGAG CAGCTGCGAC TTACCGCAGC 4920

CGGTGGGCCC GTAAATCACA CCTATTACCG GGTGCAACTG GTAGTTAAGA GAGCTGCAGC 4980

TGCCGTCATC CCTGAGCAGG GGGGCCACTT CGTTAAGCAT GTCCCTGACT CGCATGTTTT 5040

CCCTGACCAA ATCCGCCAGA AGGCGCTCGC CGCCCAGCGA TAGCAGTTCT TGCAAGGAAG 5100

CAAAGTTTTT CAACGGTTTG AGACCGTCCG CCGTAGGCAT GCTTTTGAGC GTTTGACCAA 5160

GCAGTTCCAG GCGGTCCCAC AGCTCGGTCA CCTGCTCTAC GGCATCTCGA TCCAGCATAT 5220

CTCCTCGTTT CGCGGGTTGG GGCGGCTTTC GCTGTACGGC AGTAGTCGGT GCTCGTCCAG 5280

ACGGGCCAGG GTCATGTCTT TCCACGGGCG CAGGGTCCTC GTCAGCGTAG TCTGGGTCAC 5340

GGTGAAGGGG TGCGCTCCGG GCTGCGCGCT GGCCAGGGTG CGCTTGAGGC TGGTCCTGCT 5400

GGTGCTGAAG CGCTGCCGGT CTTCGCCCTG CGCGTCGGCC AGGTAGCATT TGACCATGGT 5460

GTCATAGTCC AGCCCCTCCG CGGCGTGGCC CTTGGCGCGC AGCTTGCCCT TGGAGGAGGC 5520

GCCGCACGAG GGGCAGTGCA GACTTTTGAG GGCGTAGAGC TTGGGCGCGA GAAATACCGA 5580

TTCCGGGGAG TAGGCATCCG CGCCGCAGGC CCCGCAGACG GTCTCGCATT CCACGAGCCA 5640

GGTGAGCTCT GGCCGTTCGG GGTCAAAAAC CAGGTTTCCC CCATGCTTTT TGATGCGTTT 5700

CTTACCTCTG GTTTCCATGA GCCGGTGTCC ACGCTCGGTG ACGAAAAGGC TGTCCGTGTC 5760

CCCGTATACA GACTTGAGAG GCCTGTCCTC GAGCGGTGTT CCGCGGTCCT CCTCGTATAG 5820

AAACTCGGAC CACTCTGAGA CAAAGGCTCG CGTCCAGGCC AGCACGAAGG AGGCTAAGTG 5880

GGAGGGGTAG CGGTCGTTGT CCACTAGGGG GTCCACTCGC TCCAGGGTGT GAAGACACAT 5940

GTCGCCCTCT TCGGCATCAA GGAAGGTGAT TGGTTTGTAG GTGTAGGCCA CGTGACCGGG 6000

TGTTCCTGAA GGGGGGCTAT AAAAGGGGGT GGGGGCGCGT TCGTCCTCAC TCTCTTCCGC 6060

ATCGCTGTCT GCGAGGGCCA GCTGTTGGGG TGAGTACTCC CTCTGAAAAG CGGGCATGAC 6120

TTCTGCGCTA AGATTGTCAG TTTCCAAAAA CGAGGAGGAT TTGATATTCA CCTGGCCCGC 6180

GGTGATGCCT TTGAGGGTGG CCGCATCCAT CTGGTCAGAA AAGACAATCT TTTTGTTGTC 6240

AAGCTTGGTG GCAAACGACC CGTAGAGGGC GTTGGACAGC AACTTGGCGA TGGAGCGCAG 6300

GGTTTGGTTT TTGTCGCGAT CGGCGCGCTC CTTGGCCGCG ATGTTTAGCT GCACGTATTC 6360

GCGCGCAACG CACCGCCATT CGGGAAAGAC GGTGGTGCGC TCGTCGGGCA CCAGGTGCAC 6420

GCGCCAACCG CGGTTGTGCA GGGTGACAAG GTCAACGCTG GTGGCTACCT CTCCGCGTAG 6480

GCGCTCGTTG GTCCAGCAGA GGCGGCCGCC CTTGCGCGAG CAGAATGGCG GTAGGGGGTC 6540

TAGCTGCGTC TCGTCCGGGG GGTCTGCGTC CACGGTAAAG ACCCCGGGCA GCAGGCGCGC 6600

GTCGAAGTAG TCTATCTTGC ATCCTTGCAA GTCTAGCGCC TGCTGCCATG CGCGGGCGGC 6660

AAGCGCGCGC TCGTATGGGT TGAGTGGGGG ACCCCATGGC ATGGGGTGGG TGAGCGCGGA 6720

GGCGTACATG CCGCAAATGT CGTAAACGTA GAGGGGCTCT CTGAGTATTC CAAGATATGT 6780

AGGGTAGCAT CTTCCACCGC GGATGCTGGC GCGCACGTAA TCGTATAGTT CGTGCGAGGG 6840

AGCGAGGAGG TCGGGACCGA GGTTGCTACG GGCGGGCTGC TCTGCTCGGA AGACTATCTG 6900

CCTGAAGATG GCATGTGAGT TGGATGATAT GGTTGGACGC TGGAAGACGT TGAAGCTGGC 6960

GTCTGTGAGA CCTACCGCGT CACGCACGAA GGAGGCGTAG GAGTCGCGCA GCTTGTTGAC 7020

CAGCTCGGCG GTGACCTGCA CGTCTAGGGC GCAGTAGTCC AGGGTTTCCT TGATGATGTC 7080

ATACTTATCC TGTCCCTTTT TTTTCCACAG CTCGCGGTTG AGGACAAACT CTTCGCGGTC 7140

TTTCCAGTAC TCTTGGATCG GAAACCCGTC GGCCTCCGAA CGGTAAGAGC CTAGCATGTA 7200

GAACTGGTTG ACGGCCTGGT AGGCGCAGCA TCCCTTTTCT ACGGGTAGCG CGTATGCCTG 7260

CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG CGCAAAGGTG TCCCTGACCA TGACTTTGAG 7320

GTACTGGTAT TTGAAGTCAG TGTCGTCGCA TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT 7380

GCGCTTTTTG GAACGCGGAT TTGGCAGGGC GAAGGTGACA TCGTTGAAGA GTATCTTTCC 7440

CGCGCGAGGC ATAAAGTTGC GTGTGATGCG GAAGGGTCCC GGCACCTCGG AACGGTTGTT 7500

AATTACCTGG GCGGCGAGCA CGATCTCGTC AAAGCCGTTG ATGTTGTGGC CCACAATGTA 7560

AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT 7620

GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC TGAAAGGGCC CAGTCTGCAA GATGAGGGTT 7680

GGAAGCGACG AATGAGCTCC ACAGGTCACG GGCCATTAGC ATTTGCAGGT GGTCGCGAAA 7740

GGTCCTAAAC TGGCGACCTA TGGCCATTTT TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG 7800

GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT CGCGGCTAGG TCTCGCGCGG CAGTCACTAG 7860

AGGCTCATCT CCGCCGAACT TCATGACCAG CATGAAGGGC ACGAGCTGCT TCCCAAAGGC 7920

CCCCATCCAA GTATAGGTCT CTACATCGTA GGTGACAAAG AGACGCTCGG TGCGAGGATG 7980

CGAGCCGATC GGGAAGAACT GGATCTCCCG CCACCAATTG GAGGAGTGGC TATTGATGTG 8040

GTGAAAGTAG AAGTCCCTGC GACGGGCCGA ACACTCGTGC TGGCTTTTGT AAAAACGTGC 8100

GCAGTACTGG CAGCGGTGCA CGGGCTGTAC ATCCTGCACG AGGTTGACCT GACGACCGCG 8160

CACAAGGAAG CAGAGTGGGA ATTTGAGCCC CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC 8220

TACTTCGGCT GCTTGTCCTT GACCGTCTGG CTGCTCGAGG GGAGTTACGG TGGATCGGAC 8280

CACCACGCCG CGCGAGCCCA AAGTCCAGAT GTCCGCGCGC GGCGGTCGGA GCTTGATGAC 8340

AACATCGCGC AGATGGGAGC TGTCCATGGT CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG 8400

GAGCTCCTGC AGGTTTACCT CGCATAGACG GGTCAGGGCG CGGGCTAGAT CCAGGTGATA 8460

CCTAATTTCC AGGGGCTGGT TGGTGGCGGC GTCGATGGCT TGCAAGAGGC CGCATCCCCG 8520

CGGCGCGACT ACGGTACCGC GCGGCGGGCG GTGGGCCGCG GGGGTGTCCT TGGATGATGC 8580

ATCTAAAAGC GGTGACGCGG GCGAGCCCCC GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG 8640

AGAGGGGGCA GGGGCACGTC GGCGCCGCGC GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG 8700

TTGCTGGCGA ACGCGACGAC GCGGCGGTTG ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG 8760

ACGACGGGCC CGGTGAGCTT GAGCCTGAAA GAGAGTTCGA CAGAATCAAT TTCGGTGTCG 8820

TTGACGGCGG CCTGGCGCAA AATCTCCTGC ACGTCTCCTG AGTTGTCTTG ATAGGCGATC 8880
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TCGGCCATGA 
GTGGCGGCGA 
TCGTTCCAGA 
TGCGCGAGAT 
AGGTAGTTGA 
AACGTGGATT 
ACGGCGAAGT 
CGGATGAGCT 
TCTTCTTCAA 
GGAGGGGGGA 
ATCTCCCCGC 
AGTTGGAAGA 
AGGGATACGG 
GACCTGAGCG 
TCACAGTCGC 
TTTCTGGCGG 
GTCGACAGAA 
CCCCAGGCTT 
ACCGGCACTT 
GCGGCGGAGT 
CTCATCGGCT 
ACCTGCGTGA 
TTGATGGTGT 
GAGAGCTCGG 
GTCCGCACCA 
CAGCGTAGGG 
TAGATGTACC 
CGGACGCGGT 
CCGGTCAGGC 
GGCACTCTTC 
TCGAGCCCCG 
CAGGTGTGCG 
GCTGCTGCGC 
GCGAAAGCAT 
GCGGGACCCC 



ACTGCTCGAT 
GGTCGTTGGA 
CGCGGCTGTA 
TGAGCTCCAC 
GGGTGGTGGC 
CGTTGATATC 
TGAAAAACTG 
CGGCGACAGT 
TCTCCTCTTC 
CACGGCGGCG 
GGCGACGGCG 
CGCCGCCCGT 
CGCTAACGAT 
AGTCCGCATC 
AAGGTAGGCT 
AGGTGCTGCT 
GCACCATGTC 
CGTTTTGACA 
CTTCTTCTCC 
TTGGCCGTAG 
GAAGCAGGGC 
GGGTAGACTG 
AAGTGCAGTT 
TGTACCTGAG 
GGTACTGGTA 
TGGCCGGGGC 
TGGACATCCA 
TCCAGATGTT 
GCGCGCAATC 
CGTGGTCTGG 
TATCCGGCCG 
ACGTCAGACA 
TAGCTTTTTT 
TAAGTGGCTC 
CGGTTCGAGT 



CTCTTCCTCC 
AATGCGGGCC 
GACCACGCCC 
GTGCCGGGCG 
GGTGTGTTCT 
CCCCAAGGCC 
GGAGTTGCGC 
GTCGCGCACC 
CATAAGGGCC 
ACGACGGCGC 
CATGGTCTCG 
CATGTCCCGG 
GCATCTCAAC 
GACCGGATCG 
GAGCACCGTG 
GATGATGTAA 
CTTGGGTCCG 
TCGGCGCAGG 
TTCCTCTTGT 
GTGGCGCCCT 
TAGGTCGGCG 
GAAGTCATCC 
GGCCATAACG 
ACGCGAGTAA 
TCCCACCAAA 
TCCGGGGGCG 
GGTGATGCCG 
GCGCAGCGGC 
GTTGACGCTC 
TGGATAAATT 
TCCGCCGTGA 
ACGGGGGAGT 
GGCCACTGGC 
GCTCCCTGTA 
CTCGGACCGG 



TGGAGATCTC 
ATGAGCTGCG 
CCTTCGGCAT 
AAGACGGCGT 
GCCACGAAGA 
TCAAGGCGCT 
GCCGACACGG 
TCGCGCTCAA 
TCCCCTTCTT 
ACCGGGAGGC 
GTGACGGCGC 
TTATGGGTTG 
AATTGTTGTG 
GAAAACCTCT 
GCGGGCGGCA 
TTAAAGTAGG 
GCCTGCTGAA 
TCTTTGTAGT 
CCTGCATCTC 
CTTCCTCCCA 
ACAACGCGCT 
ATGTCCACAA 
GACCAGTTAA 
GCCCTCGAGT 
AAGTGCGGCG 
AGATCTTCCA 
GCGGCGGTGG 
AAAAAGTGCT 
TAGACCGTGC 
CGCAAGGGTA 
TCCATGCGGT 
GCTCCTTTTG 
CGCGCGCAGC 
GCCGGAGGGT 
CCGGACTGCG 



CGCGTCCGGC 
AGAAGGCGTT 
CGCGGGCGCG 
AGTTTCGCAG 
AGTACATAAC 
CCATGGCCTC 
TTAACTCCTC 
AGGCTACAGG 
CTTCTTCTGG 
GGTCGACAAA 
GGCCGTTCTC 
GCGGGGGGCT 
TAGGTACTCC 
CGAGAAAGGC 
GCGGGCGGCG 
CGGTCTTGAG 
TGCGCAGGCG 
AGTCTTGCAT 
TTGCATCTAT 
TGCGTGTGAC 
CGGCTAATAT 
AGCGGTGGTA 
CGGTCTGGTG 
CAAATACGTA 
GCGGCTGGCG 
ACATAAGGCG 
TGGAGGCGCG 
CCATGGTCGG 
AAAAGGAGAG 
TCATGGCGGA 
TACCGCCCGC 
GCTTCCTTCC 
GTAAGCGGTT 
TATTTTCCAA 
GCGAACGGGG 
TCGCTCCACG 8940

GAGGCCTCCC 9000

CATGACCACC 9060

GCGCTGAAAG 9120

CCAGCGTCGC 9180

GTAGAAGTCC 9240

CTCCAGAAGA 9300

GGCCTCTTCT 9360

CGGCGGTGGG 9420

GCGCTCGATC 9480

GCGGGGGCGC 9540

GCCATGCGGC 9600

GCCGCCGAGG 9660

GTCTAACCAG 9720

GTCGGGGTTG 9780

ACGGCGGATG 9840

GTCGGCCATG 9900

GAGCCTTTCT 9960

CGCTGCGGCG 10020

CCCGAAGCCC 10080

GGCCTGCTGC 10140

TGCGCCCGTG 10200

ACCCGGCTGC 10260

GTCGTTGCAA 10320

GTAGAGGGGC 10380

ATGATATCCG 10440

CGGAAAGTCG 10500

GACGCTCTGG 10560

CCTGTAAGCG 10620 

CGACCGGGGT 10680 

GTGTCGAACC 10740 

AGGCGCGGCG 10800 

AGGCTGGAAA 10860 

GGGTTGAGTC 10920 

GTTTGCCTCC 10980 
CCGTCATGCA


AGACCCCGCT 


TGCAAATTCC 


TCCGGAAACA 


GGGACGAGCC 


CCTTTTTTGC 


11040 


TTTTCCCAGA 


TGCATCCGGT 


GCTGCGGCAG 


ATGCGCCCCC 


CTCCTCAGCA 


GCGGCAAGAG 


11100 


CAAGAGCAGC 


GGCAGACATG 


CAGGGCACCC 


TCCCCTCCTC 


CTACCGCGTC 


AGGAGGGGCG 


11160 


ACATCCGCGG 


TTGACGCGGC 


AGCAGATGGT 


GATTACGAAC 


CCCCGCGGCG 


CCGGGCCCGG 


11220 


CACTACCTGG 


ACTTGGAGGA 


GGGCGAGGGC 


CTGGCGCGGC 


TAGGAGCGCC 


CTCTCCTGAG 


11280 


CGGTACCCAA 


GGGTGCAGCT 


GAAGCGTGAT 


ACGCGTGAGG 


CGTACGTGCC 


GCGGCAGAAC 


11340 


CTGTTTCGCG 


ACCGCGAGGG 


AGAGGAGCCC 


GAGGAGATGC 


GGGATCGAAA 


GTTCCACGCA 


11400 


GGGCGCGAGC 


TGCGGCATGG 


CCTGAATCGC 


GAGCGGTTGC 


TGCGCGAGGA 


GGACTTTGAG 


11460 


CCCGACGCGC 


GAACCGGGAT 


TAGTCCCGCG 


CGCGCACACG 


TGGCGGCCGC 


CGACCTGGTA 


11520 


ACCGCATACG 


AGCAGACGGT 


GAACCAGGAG 


ATTAACTTTC 


AAAAAAGCTT 


TAACAACCAC 


11580 


GTGCGTACGC 


TTGTGGCGCG 


CGAGGAGGTG 


GCTATAGGAC 


TGATGCATCT 


GTGGGACTTT 


11640 


GTAAGCGCGC 


TGGAGCAAAA 


CCCAAATAGC 


AAGCCGCTCA 


TGGCGCAGCT 


GTTCCTTATA 


11700 


GTGCAGCACA 


GCAGGGACAA 


CGAGGCATTC 


AGGGATGCGC 


TGCTAAACAT 


AGTAGAGCCC 


11760 


GAGGGCCGCT 


GGCTGCTCGA 


TTTGATAAAC 


ATCCTGCAGA 


GCATAGTGGT 


GCAGGAGCGC 


11820 


AGCTTGAGCC 


TGGCTGACAA 


GGTGGCCGCC 


ATCAACTATT 


CCATGCTTAG 


CCTGGGCAAG 


11880 


TTTTACGCCC 


GCAAGATATA 


CCATACCCCT 


TACGTTCCCA 


TAGACAAGGA 


GGTAAAGATC 


11940 


GAGGGGTTCT 


ACATGCGCAT 


GGCGCTGAAG 


GTGCTTACCT 


TGAGCGACGA 


CCTGGGCGTT 


12000 


TATCGCAACG 


AGCGCATCCA 


CAAGGCCGTG 


AGCGTGAGCC 


GGCGGCGCGA 


GCTCAGCGAC 


12060 


CGCGAGCTGA 


TGCACAGCCT 


GCAAAGGGCC 


CTGGCTGGCA 


CGGGCAGCGG 


CGATAGAGAG 


12120 


GCCGAGTCCT 


ACTTTGACGC 


GGGCGCTGAC 


CTGCGCTGGG 


CCCCAAGCCG 


ACGCGCCCTG 


12180 


GAGGCAGCTG 


GGGCCGGACC 


TGGGCTGGCG 


GTGGCACCCG 


CGCGCGCTGG 


CAACGTCGGC 


12240 


GGCGTGGAGG 


AATATGACGA 


GGACGATGAG 


TACGAGCCAG 


AGGACGGCGA 


GTACTAAGCG 


12300 


GTGATGTTTC 


TGATCAGATG 


ATGCAAGACG 


CAACGGACCC 


GGCGGTGCGG 


GCGGCGCTGC 


12360 


AGAGCCAGCC 


GTCCGGCCTT 


AACTCCACGG 


ACGACTGGCG 


CCAGGTCATG 


GACCGCATCA 


12420 


TGTCGCTGAC 


TGCGCGCAAT 


CCTGACGCGT 


TCCGGCAGCA 


GCCGCAGGCC 


AACCGGCTCT 


12480 


CCGCAATTCT 


GGAAGCGGTG 


GTCCCGGCGC 


GCGCAAACCC 


CACGCACGAG 


AAGGTGCTGG 


12540 


CGATCGTAAA 


CGCGCTGGCC 


GAAAACAGGG 


CCATCCGGCC 


CGACGAGGCC 


GGCCTGGTCT 


12600 


ACGACGCGCT 


GCTTCAGCGC 


GTGGCTCGTT 


ACAACAGCGG 


CAACGTGCAG 


ACCAACCTGG 


12660 


ACCGGCTGGT 


GGGGGATGTG 


CGCGAGGCCG 


TGGCGCAGCG 


TGAGCGCGCG 


CAGCAGCAGG 


12720 


GCAACCTGGG 


CTCCATGGTT 


GCACTAAACG 


CCTTCCTGAG 


TACACAGCCC 


GCCAACGTGC 


12780 


CGCGGGGACA 


GGAGGACTAC 


ACCAACTTTG 


TGAGCGCACT 


GCGGCTAATG 


GTGACTGAGA 


12840 


CACCGCAAAG 


TGAGGTGTAC 


CAGTCTGGGC 


CAGACTATTT 


TTTCCAGACC 


AGTAGACAAG 


12900 


GCCTGCAGAC 


CGTAAACCTG 


AGCCAGGCTT 


TCAAAAACTT 


GCAGGGGCTG 


TGGGGGGTGC 


12960


GGGCTCCCAC 


AGGCGACCGC 


GCGACCGTGT 


CTAGCTTGCT 


GACGCCCAAC 


TCGCGCCTGT 


13020 


TGCTGCTGCT 


AATAGCGCCC 


TTCACGGACA 


GTGGCAGCGT 


GTCCCGGGAC 


ACATACCTAG 


13080 
GTCACTTGCT
TCCAGGAGAT 
CAACCCTAAA 
ACAGCGAGGA 
GCGACGGGGT 
TGTATGCCTC 
CCGTGAACCC 
GTTTCTACAC 
TAGACGACAG 
AGGCAGAGGC 
GCGCTGCGGC 
CCAGCACTCG 
TGCTGCAGCC 
GCCTAGTGGA 
GCCCGCGCCC 
ACGATGACTC 
CGCACCTTCG 
AAAAACTCAC 
GCGCGCGGCG 
GCCAGTGGCG 
TCCGCGGTAC 
CCTATTCGAC 
GAACTACCAG 
CCCGGGGGAG 
CCTGAAAACC 
GTTTAAGGCG 
ATACGAGTGG 
CCTTATGAAC 
GGAAAGCGAC 
CACTGGTCTT 
GCTGCCAGGA 
CAAGCGGCAA 
CATTCCCGCA 
GGGCGGGGGT 
CGCGGCAGCC 



GACACTGTAC 
TACAAGTGTC 
CTACCTGCTG 
GGAGCGCATT 
AACGCCCAGC 
AAACCGGCCG 
CGAGTATTTC 
CGGGGGATTC 
CGTGTTTTCC 
GGCGCTGCGA 
CCCGCGGTCA 
CACCACCCGC 
GCAGCGCGAA 
CAAGATGAGT 
GCCCACCCGT 
GGCAGACGAC 
CCCCAGGCTG 
CAAGGCCATG 
ATGTATGAGG 
GCGGCGCTGG 
CTGCGGCCTA 
ACCACCCGTG 
AACGACCACA 
GCAAGCACAC 
ATCCTGCATA 
CGGGTGATGG 
GTGGAGTTCA 
AACGCGATCG 
ATCGGGGTAA 
GTCATGCCTG 
TGCGGGGTGG 
CCCTTCCAGG 
CTGTTGGATG 
GGCGCAGGCG 
GCGGCAATGC 



CGCGAGGCCA 
AGCCGCGCGC 
ACCAACCGGC 
TTGCGCTACG 
GTGGCGCTGG 
TTTATCAACC 
ACCAATGCCA 
GAGGTGCCCG 
CCGCAACCGC 
AAGGAAAGCT 
GATGCTAGTA 
CCGCGCCTGC 
AAAAACCTGC 
AGATGGAAGA 
CGTCAAAGGC 
AGCAGCGTCC 
GGGAGAATGT 
GCACCGAGCG 
AAGGTCCTCC 
GTTCTCCCTT 
CCGGGGGGAG 
TGTACCTGGT 
GCAACTTTCT 
AGACCATCAA 
CCAACATGCC 
TGTCGCGCTT 
CGCTGCCCGA 
TGGAGCACTA 
AGTTTGACAC 
GGGTATATAC 
ACTTCACCCA 
AGGGCTTTAG 
TGGACGCCTA 
GCAGCAACAG 
AGCCGGTGGA 



TAGGTCAGGC 
TGGGGCAGGA 
GGCAGAAGAT 
TGCAGCAGAG 
ACATGACCGC 
GCCTAATGGA 
TCTTGAACCC 
AGGGTAACGA 
AGACCCTGCT 
TCCGCAGGCC 
GCCCATTTCC 
TGGGCGAGGA 
CTCCGGCATT 
CGTACGCGCA 
ACGACCGTCA 
TGGATTTGGG 
TTTAAAAAAA 
TTGGTTTTCT 
TCCCTCCTAC 
CGATGCTCCC 
AAACAGCATC 
GGACAACAAG 
GACCACGGTC 
TCTTGACGAC 
AAATGTGAAC 
GCCTACTAAG 
GGGCAACTAC 
CTTGAAAGTG 
CCGCAACTTC 
AAACGAAGCC 
CAGCCGCCTG 
GATCACCTAC 
CCAGGCGAGC 
CAGTGGCAGC 
GGACATGAAC 



GCATGTGGAC 
GGACACGGGC 
CCCCTCGTTG 
CGTGAGCCTT 
GCGCAACATG 
CTACTTGCAT 
GCACTGGCTA 
TGGATTCCTC 
AGAGTTGCAA 
AAGCAGCTTG 
AAGCTTGATA 
GGAGTACCTA 
TCCCAACAAC 
GGAGCACAGG 
GCGGGGTCTG 
AGGGAGTGGC 
AAAAAGCATG 
TGTATTCCCC 
GAGAGTGTGG 
CTGGACCCGC 
CGTTACTCTG 
TCAACGGATG 
ATTCAAAACA 
CGGTCGCACT 
GAGTTCATGT 
GACAATCAGG 
TCCGAGACCA 
GGCAGACAGA 
AGACTGGGGT 
TTCCATCCAG 
AGCAACTTGT 
GATGATCTGG 
TTGAAAGATG 
GGCGCGGAAG 
GATCATGCCA 
GAGCATACTT 13140

AGCCTGGAGG 13200

CACAGTTTAA 13260

AACCTGATGC 13320

GAACCGGGCA 13380

CGCGCGGCCG 13440

CCGCCCCCTG 13500

TGGGACGACA 13560

CAGCGCGAGC 13620

TCCGATCTAG 13680

GGGTCTCTTA 13740

AACAACTCGC 13800

GGGATAGAGA 13860

GACGTGCCAG 13920

GTGTGGGAGG 13980

AACCCGTTTG 14040

ATGCAAAATA 14100

TTAGTATGCG 14160

TGAGCGCGGC 14220

CGTTTGTGCC 14280

AGTTGGCACC 14340

TGGCATCCCT 14400

ATGACTACAG 14460

GGGGCGGCGA 14520

TTACCAATAA 14580

TGGAGCTGAA 14640

TGACCATAGA 14700

ACGGGGTTCT 14760

TTGACCCCGT 14820

ACATCATTTT 14880

TGGGCATCCG 14940

AGGGTGGTAA 15000 

ACACCGAACA 15060 

AGAACTCCAA 15120 

TTCGCGGCGA 15180 




CACCTTTGCC ACACGGGCTG AGGAGAAGCG 
CGCCCCCGCT GCGCAACCCG AGGTCGAGAA 
GACAGAGGAC AGCAAGAAAC GCAGTTACAA 
GTACCGCAGC TGGTACCTTG CATACAACTA 
GACCCTGCTT TGCACTCCTG ACGTAACCTG 
AGACATGATG CAAGACCCCG TGACCTTCCG 
GGTGGGCGCC GAGCTGTTGC CCGTGCACTC 
CTCCCAACTC ATCCGCCAGT TTACCTCTCT 
CCAGATTTTG GCGCGCCCGC CAGCCCCCAC 
TCTCACAGAT CACGGGACGC TACCGCTGCG 
CATTACTGAC GCCAGACGCC GCACCTGCCC 
GCCGCGCGTC CTATCGAGCC GCACTTTTTG 
CAATAACACA GGCTGGGGCC TGCGCTTCCC 
CTCCGACCAA CACCCAGTGC GCGTGCGCGG 
ACGCGGCCGC ACTGGGCGCA CCACCGTCGA 
GCGCAACTAC ACGCCCACGC CGCCACCAGT 
GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT 
CCACCGCCGC CGACCCGGCA CTGCCGCCCA 
ACGTCGCACC GGCCGACGGG CGGCCATGCG 
CACTGTGCCC CCCAGGTCCA GGCGACGAGC 
TATGACTCAG GGTCGCAGGG GCAACGTGTA 
CGTGCCCGTG CGCACCCGCC CCCCGCGCAA 
GTACTGTTGT ATGTATCCAG CGGCGGCGGC 
CAAAGAAGAG ATGCTCCAGG TCATCGCGCC 
GCAGGATTAC AAGCCCCGAA AGCTAAAGCG 
TGAACTTGAC GACGAGGTGG AACTGCTGCA 
GAAAGGTCGA CGCGTAAAAC GTGTTTTGCG 
TGAGCGCTCC ACCCGCACCT ACAAGCGCGT 
GCTTGAGCAG GCCAACGAGC GCCTCGGGGA 
GCTGGCGTTG CCGCTGGACG AGGGCAACCC 
GCAGGTGCTG CCCGCGCTTG CACCGTCCGA 
TGACTTGGCA CCCACCGTGC AGCTGATGGT 
GGAAAAAATG ACCGTGGAAC CTGGGCTGGA 
GGTGGCGCCG GGACTGGGCG TGCAGACCGT 
CAGTATTGCC ACCGCCACAG AGGGCATGGA 




CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 15240

GCCTCAGAAG AAACCGGTGA TCAAACCCCT 15300

CCTAATAAGC AATGACAGCA CCTTCACCCA 15360

CGGCGACCCT CAGACCGGAA TCCGCTCATG 15420

CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 15480

CTCCACGCGC CAGATCAGCA ACTTTCCGGT 15540

CAAGAGCTTC TACAACGACC AGGCCGTCTA 15600

GACCCACGTG TTCAATCGCT TTCCCGAGAA 15660

CATCACCACC GTCAGTGAAA ACGTTCCTGC 15720

CAACAGCATC GGAGGAGTCC AGCGAGTGAC 15780

CTACGTTTAC AAGGCCCTGG GCATAGTCTC 15840

AGCAAGCATG TCCATCCTTA TATCGCCCAG 15900

AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 15960

GCACTACCGC GCGCCCTGGG GCGCGCACAA 16020

TGACGCCATC GACGCGGTGG TGGAGGAGGC 16080

GTCCACAGTG GACGCGGCCA TTCAGACCGT 16140

GAAGAGACGG CGGAGGCGCG TAGCACGTCG 16200

ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 16260

GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 16320

GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 16380

TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 16440

CTAGATTGCA AGAAAAAACT ACTTAGACTC 16500

GCGCAACGAA GCTATGTCCA AGCGCAAAAT 16560

GGAGATCTAT GGCCCCCCGA AGAAGGAAGA 16620

GGTCAAAAAG AAAAAGAAAG ATGATGATGA 16680

CGCTACCGCG CCCAGGCGAC GGGTACAGTG 16740

ACCCGGCACC ACCGTAGTCT TTACGCCCGG 16800

GTATGATGAG GTGTACGGCG ACGAGGACCT 16860

GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 16920

AACACCTAGC CTAAAGCCCG TAACACTGCA 16980

AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 17040

ACCCAAGCGC CAGCGACTGG AAGATGTCTT 17100

GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 17160

GGACGTTCAG ATACCCACTA CCAGTAGCAC 17220 

GACACAAACG TCCCCGGTTG CCTCAGCGGT 17280 
GGCGGATGCC GCGGTGCAGG CGGTCGCTGC
AACGGACCCG TGGATGTTTC GCGTTTCAGC 
CGGCGCCGCC AGCGCGCTAC TGCCCGAATA 
CGGCTATCGT GGCTACACCT ACCGCCCCAG 
CACTGGAACC CGGCGCCGCC GTCGCCGTCG 
CAGGGTGGCT CGCGAAGGAG GCAGGACCCT 
CATCGTTTAA AAGCCGGTCT TTGTGGTTCT 
TTTCCCGGTG CCGGGATTCC GAGGAAGAAT 
CCTGACGGGC GGCATGCGTC GTGCGCACCA 
GCGCGGCGGT ATCCTGCCCC TCCTTATTCC 
CGGAATTGCA TCCGTGGCCT TGCAGGCGCA 
GAAAAATCAA AATAAAAAGT CTGGACTCTC 
GAATGGAAGA CATCAACTTT GCGTCTCTGG 
GAAACTGGCA AGATATCGGC ACCAGCAATA 
TGTGGAGCGG CATTAAAAAT TTCGGTTCCA 
ACAGCAGCAC AGGCCAGATG CTGAGGGATA 
TGGTAGATGG CCTGGCCTCT GGCATTAGCG 
AAAATAAGAT TAACAGTAAG CTTGATCCCC 
TGGAGACAGT GTCTCCAGAG GGGCGTGGCG 
CTCTGGTGAC GCAAATAGAC GAGCCTCCCT 
CCACCACCCG TCCCATCGCG CCCATGGCTA 
CGCTGGACCT GCCTCCCCCC GCCGACACCC 
CCGTTGTTGT AACCCGTCCT AGCCGCGCGT 
CGTTGCGGCC CGTAGCCAGT GGCAACTGGC 
GGGTGCAATC CCTGAAGCGC CGACGATGCT 
ATGTATGCGT CCATGTCGCC GCCAGAGGAG 
GATGGCTACC CCTTCGATGA TGCCGCAGTG 
CTCGGAGTAC CTGAGCCCCG GGCTGGTGCA 
CCTGAATAAC AAGTTTAGAA ACCCCACGGT 
GTCCCAGCGT TTGACGCTGC GGTTCATCCC 
CAAGGCGCGG TTCACCCTAG CTGTGGGTGA 
CTTTGACATC CGCGGCGTGC TGGACAGGGG 
CTACAACGCC CTGGCTCCCA AGGGTGCCCC 
TGCTCTTGAA ATAAACCTAG AAGAAGAGGA 
AGCTGAGCAG CAAAAAACTC ACGTATTTGG 
GGCCGCGTCC


AAGACCTCTA 


CGGAGGTGCA 


17340 


CCCCCGGCGC 


CCGCGCGGTT 


CGAGGAAGTA 


17400


TGCCCTACAT 


CCTTCCATTG 


CGCCTACCCC 


17460 


AAGACGAGCA 


ACTACCCGAC 


GCCGAACCAC 


17520 


CCAGCCCGTG 


CTGGCCCCGA 


TTTCCGTGCG 


17580 


GGTGCTGCCA 


ACAGCGCGCT 


ACCACCCCAG 


17640 


TGCAGATATG 


GCCCTCACCT 


GCCGCCTCCG 


17700 


GCACCGTAGG 


AGGGGCATGG 


CCGGCCACGG 


17760 


CCGGCGGCGG 


CGCGCGTCGC 


ACCGTCGCAT 


17820 


ACTGATCGCC 


GCGGCGATTG 


GCGCCGTGCC 


17880 


GAGACACTGA 


TTAAAAACAA 


GTTGCATGTG 


17940 


ACGCTCGCTT 


GGTCCTGTAA 


CTATTTTGTA 


18000 


CCCCGCGACA 


CGGCTCGCGC 


CCGTTCATGG 


18060 


TGAGCGGTGG 


CGCCTTCAGC 


TGGGGCTCGC 


18120 


CCGTTAAGAA 


CTATGGCAGC 


AAGGCCTGGA 


18180 


AGTTGAAAGA 


GCAAAATTTC 


CAACAAAAGG 


18240 


GGGTGGTGGA 


CCTGGCCAAC 


CAGGCAGTGC 


18300 


GCCCTCCCGT 


AGAGGAGCCT 


CCACCGGCCG 


18360 


AAAAGCGTCC 


GCGCCCCGAC 


AGGGAAGAAA 


18420 


CGTACGAGGA 


GGCACTAAAG 


CAAGGCCTGC 


18480 


CCGGAGTGCT 


GGGCCAGCAC 


ACACCCGTAA 


18540 


AGCAGAAACC 


TGTGCTGCCA 


GGCCCGACCG 


18600 


CCCTGCGCCG 


CGCCGCCAGC 


GGTCCGCGAT 


18660 


AAAGCACACT 


GAACAGCATC 


GTGGGTCTGG 


18720 


TCTGAATAGC 


TAACGTGTCG 


TATGTGTGTC 


18780 


CTGCTGAGCC 


GCCGCGCGCC 


CGCTTTCCAA 


18840 


GTCTTACATG 


CACATCTCGG 


GCCAGGACGC 


18900 


GTTTGCCCGC 


GCCACCGAGA 


CGTACTTCAG 


18960 


GGCGCCTACG 


CACGACGTGA 


CCACAGACCG 


19020 


TGTGGACCGT 


GAGGATACTG 


CGTACTCGTA 


19080 


TAACCGTGTG 


CTGGACATGG 


CTTCCACGTA 


19140 


CCCTACTTTT 


AAGCCCTACT 


CTGGCACTGC 


19200 


AAATCCTTGC 


GAATGGGATG 


AAGCTGCTAC 


19260 


CGATGACAAC 


GAAGACGAAG 


TAGACGAGCA 


19320 


GCAGGCGCCT 


TATTCTGGTA 


TAAATATTAC 


19380 
AAAGGAGGGT

TCAACCTGAA 
TGGGAGAGTC 
CACAAATGAA 
TCAAGTGGAA 
GACTCCTAAA 
TTCTTACATG 
GCCCAACAGG 
CAGCACGGGT 
TTTGCAAGAC 
AACCAGGTAC 
TATTGAAAAT 
GATTAATACA 
AAAAGATGCT 
GGAAATCAAT 
TTTGCCCGAC 
CTACGACTAC 
TGGAGCACGC 
TGCTGGCCTG 
CCAGGTGCCT 
CTACGAGTGG 
CCTAAGGGTT 
CCCCATGGCC 
CCAGTCCTTT 
TACCAACGTG 
CACGCGCCTT 
CTACTCTGGC 
GGTGGCCATT 
CAACGAGTTT 
CATGACCAAA 
CTTCTATATC 
CATGAGCCGT 
ACACCAACAC 
GGCCTACCCT 
CCAGAAAAAG 



ATTCAAATAG 
CCTCAAATAG 
CTTAAAAAGA 
AATGGAGGGC 
ATGCAATTTT 
GTGGTATTGT 
CCCACTATTA 
CCTAATTACA 
AATATGGGTG 
AGAAACACAG 
TTTTCTATGT 
CATGGAACTG 
GAGACTCTTA 
ACAGAATTTT 
CTAAATGCCA 
AAGCTAAAGT 
ATGAACAAGC 
TGGTCCCTTG 
CGCTACCGCT 
CAGAAGTTCT 
AACTTCAGGA 
GACGGAGCCA 
CACAACACCG 
AACGACTATC 
CCCATATCCA 
AAGACTAAGG 
TCTATACCCT 
ACCTTTGACT 
GAAATTAAGC 
GACTGGTTCC 
CCAGAGAGCT 
CAGGTGGTGG 
AACAACTCTG 
GCTAACTTCC 
TTTCTTTGCG 



GTGTCGAAGG 
GAGAATCTCA 
CTACCCCAAT 
AAGGCATTCT 
TCTCAACTAC 
ACAGTGAAGA 
AGGAAGGTAA 
TTGCTTTTAG 
TTCTGGCGGG 
AGCTTTCATA 
GGAATCAGGC 
AAGATGAACT 
CCAAGGTAAA 
CAGATAAAAA 
ACCTGTGGAG 
ACAGTCCTTC 
GAGTGGTGGC 
ACTATATGGA 
CAATGTTGCT 
TTGCCATTAA 
AGGATGTTAA 
GCATTAAGTT 
CCTCCACGCT 
TCTCCGCCGC 
TCCCCTCCCG 
AAACCCCATC 
ACCTAGATGG 
CTTCTGTCAG 
GCTCAGTTGA 
TGGTACAAAT 
ACAAGGACCG 
ATGATACTAA 
GATTTGTTGG 
CCTATCCGCT 
ATCGCACCCT 



TCAAACACCT 
GTGGTACGAA 
GAAACCATGT 
TGTAAAGCAA 
TGAGGCGACC 
TGTAGATATA 
CTCACGAGAA 
GGACAATTTT 
CCAAGCATCG 
CCAGCTTTTG 
TGTTGACAGC 
TCCAAATTAC 
ACCTAAAACA 
TGAAATAAGA 
AAATTTCCTG 
CAACGTAAAA 
TCCCGGGTTA 
CAACGTCAAC 
GGGCAATGGT 
AAACCTCCTT 
CATGGTTCTG 
TGATAGCATT 
TGAGGCCATG 
CAACATGCTC 
CAACTGGGCG 
ACTGGGCTCG 
AACCTTTTAC 
CTGGCCTGGC 
CGGGGAGGGT 
GCTAGCTAAC 
CATGTACTCC 
ATACAAGGAC 
CTACCTTGCC 
TATAGGCAAG 
TTGGCGCATC 



AAATATGCCG 
ACTGAAATTA 
TACGGTTCAT 
CAAAATGGAA 
GCAGGCAATG 
GAAACCCCAG 
CTAATGGGCC 
ATTGGTCTAA 
CAGTTGAATG 
CTTGATTCCA 
TATGATCCAG 
TGCTTTCCAC 
GGTCAGGAAA 
GTTGGAAATA 
TACTCCAACA 
ATTTCTGATA 
GTGGACTGCT 
CCATTTAACC 
CGCTATGTGC 
CTCCTGCCGG 
CAGAGCTCCC 
TGCCTTTACG 
CTTAGAAACG 
TACCCTATAC 
GCTTTCCGCG 
GGCTACGACC 
CTCAACCACA 
AATGACCGCC 
TACAACGTTG 
TACAACATTG 
TTCTTTAGAA 
TACCAACAGG 
CCCACCATGC 
ACCGCAGTTG 
CCATTCTCCA 
ATAAAACATT 19440

ATCATGCAGC 19500

ATGCAAAACC 19560

AGCTAGAAAG 19620

GTGATAACTT 19680

ACACTCATAT 19740

AACAATCTAT 19800

TGTATTACAA 19860

CTGTTGTAGA 19920

TTGGTGATAG 19980

ATGTTAGAAT 20040

TGGGAGGTGT 20100

ATGGATGGGA 20160

ATTTTGCCAT 20220

TAGCGCTGTA 20280

ACCCAAACAC 20340

ACATTAACCT 20400

ACCACCGCAA 20460

CCTTCCACAT 20520

GCTCATACAC 20580

TAGGAAATGA 20640

CCACCTTCTT 20700

ACACCAACGA 20760

CCGCCAACGC 20820

GCTGGGCCTT 20880

CTTATTACAC 20940

CCTTTAAGAA 21000

TGCTTACCCC 21060

CCCAGTGTAA 21120

GCTACCAGGG 21180

ACTTCCAGCC 21240

TGGGCATCCT 21300

GCGAAGGACA 21360

ACAGCATTAC 21420

GTAACTTTAT 21480
GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 21540

GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT 21600

TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA 21660

CCTGCGCACG CCCTTCTCGG CCGGCAACGC CACAACATAA AGAAGCAAGC AACATCAACA 21720

ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 21780

TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 21840

AAGCTCGCCT GCGCCATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG 21900

GCCTTTGCCT GGAACCCGCA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 21960

GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 22020

ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG 22080

CCCAACTCGG CCGCCTGTGG ACTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 22140

CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 22200

ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 22260

TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCACT 22320

TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGC 22380

AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 22440

GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 22500

CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG 22560

AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT 22620

ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG 22680

CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 22740

ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGC 22800

TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 22860

AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 22920

TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG 22980

GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 23040

ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 23100

TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT 23160

ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC 23220

CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG 23280

TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC 23340

TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT 23400

TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC 23460

AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG 23520

TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC 23580
ATACCACGCG


CCACTGGGTC 


GTCTTCATTC 


AGCCGCCGCA 


CTGTGCGCTT 


ACCTCCTTTG 


23640 


CCATGCTTGA 


TTAGCACCGG 


TGGGTTGCTG 


AAACCCACCA 


TTTGTAGCGC 


CACATCTTCT 


23700 


CTTTCTTCCT 


CGCTGTCCAC 


GATTACCTCT 


GGTGATGGCG 


GGCGCTCGGG 


CTTGGGAGAA 


23760 


GGGCGCTTCT 


TTTTCTTCTT 


GGGCGCAATG 


GCCAAATCCG 


CCGCCGAGGT 


CGATGGCCGC 


23820 


GGGCTGGGTG 


TGCGCGGCAC 


CAGCGCGTCT 


TGTGATGAGT 


CTTCCTCGTC 


CTCGGACTCG 


23880 


ATACGCCGCC 


TCATCCGCTT 


TTTTGGGGGC 


GCCCGGGGAG 


GCGGCGGCGA 


CGGGGACGGG 


23940 


GACGACACGT 


CCTCCATGGT 


TGGGGGACGT 


CGCGCCGCAC 


CGCGTCCGCG 


CTCGGGGGTG 


24000 


GTTTCGCGCT 


GCTCCTCTTC 


CCGACTGGCC 


ATTTCCTTCT 


CCTATAGGCA 


GAAAAAGATC 


24060 


ATGGAGTCAG 


TCGAGAAGAA 


GGACAGCCTA 


ACCGCCCCCT 


CTGAGTTCGC 


CACCACCGCC 


24120 


TCCACCGATG 


CCGCCAACGC 


GCCTACCACC 


TTCCCCGTCG 


AGGCACCCCC 


GCTTGAGGAG 


24180 


GAGGAAGTGA 


TTATCGAGCA 


GGACCCAGGT 


TTTGTAAGCG 


AAGACGACGA 


GGACCGCTCA 


24240 


GTACCAACAG 


AGGATAAAAA 


GCAAGACCAG 


GACAACGCAG 


AGGCAAACGA 


GGAACAAGTC 


24300 


GGGCGGGGGG 


ACGAAAGGCA 


TGGCGACTAC 


CTAGATGTGG 


GAGACGACGT 


GCTGTTGAAG 


24360 


CATCTGCAGC 


GCCAGTGCGC 


CATTATCTGC 


GACGCGTTGC 


AAGAGCGCAG 


CGATGTGCCC 


24420 


CTCGCCATAG 


CGGATGTCAG 


CCTTGCCTAC 


GAACGCCACC 


TATTCTCACC 


GCGCGTACCC 


24480 


CCCAAACGCC 


AAGAAAACGG 


CACATGCGAG 


CCCAACCCGC 


GCCTCAACTT 


CTACCCCGTA 


24540 


TTTGCCGTGC 


CAGAGGTGCT 


TGCCACCTAT 


CACATCTTTT 


TCCAAAACTG 


CAAGATACCC 


24600 


CTATCCTGCC 


GTGCCAACCG 


CAGCCGAGCG 


GACAAGCAGC 


TGGCCTTGCG 


GCAGGGCGCT 


24660 


GTCATACCTG 


ATATCGCCTC 


GCTCAACGAA 


GTGCCAAAAA 


TCTTTGAGGG 


TCTTGGACGC 


24720 


GACGAGAAGC 


GCGCGGCAAA 


CGCTCTGCAA 


CAGGAAAACA 


GCGAAAATGA 


AAGTCACTCT 


24780 


GGAGTGTTGG 


TGGAACTCGA 


GGGTGACAAC 


GCGCGCCTAG 


CCGTACTAAA 


ACGCAGCATC 


24840 


GAGGTCACCC 


ACTTTGCCTA 


CCCGGCACTT 


AACCTACCCC 


CCAAGGTCAT 


GAGCACAGTC 


24900 


ATGAGTGAGC 


TGATCGTGCG 


CCGTGCGCAG 


CCCCTGGAGA 


GGGATGCAAA 


TTTGCAAGAA 


24960 


CAAACAGAGG 


AGGGCCTACC 


CGCAGTTGGC 


GACGAGCAGC 


TAGCGCGCTG 


GCTTCAAACG 


25020 


CGCGAGCCTG 


CCGACTTGGA 


GGAGCGACGC 


AAACTAATGA 


TGGCCGCAGT 


GCTCGTTACC 


25080 


GTGGAGCTTG 


AGTGCATGCA 


GCGGTTCTTT 


GCTGACCCGG 


AGATGCAGCG 


CAAGCTAGAG 


25140 


GAAACATTGC 


ACTACACCTT 


TCGACAGGGC 


TACGTACGCC 


AGGCCTGCAA 


GATCTCCAAC 


25200 


GTGGAGCTCT 


GCAACCTGGT 


CTCCTACCTT 


GGAATTTTGC 


ACGAAAACCG 


CCTTGGGCAA 


25260 


AACGTGCTTC 


ATTCCACGCT 


CAAGGGCGAG 


GCGCGCCGCG 


ACTACGTCCG 


CGACTGCGTT 


25320 


TACTTATTTC 


TATGCTACAC 


CTGGCAGACG 


GCCATGGGCG 


TTTGGCAGCA 


GTGCTTGGAG 


25380 


GAGTGCAACC 


TCAAGGAGCT 


GCAGAAACTG 


CTAAAGCAAA 


ACTTGAAGGA 


CCTATGGACG 


25440 


GCCTTCAACG 


AGCGCTCCGT 


GGCCGCGCAC 


CTGGCGGACA 


TCATTTTCCC 


CGAACGCCTG 


25500 


CTTAAAACCC 


TGCAACAGGG 


TCTGCCAGAC 


TTCACCAGTC 


AAAGCATGTT 


GCAGAACTTT 


25560 


AGGAACTTTA 


TCCTAGAGCG 


CTCAGGAATC 


TTGCCCGCCA 


CCTGCTGTGC 


ACTTCCTAGC 


25620 


GACTTTGTGC 


CCATTAAGTA 


CCGCGAATGC 


CCTCCGCCGC 


TTTGGGGCCA 


CTGC7ACCTT 


25680 
CTGCAGCTAG


CCAACTACCT 


TGCCTACCAC 


TCTGACATAA 


TGGAAGACGT 


GAGCGGTGAC 


25740 


GGTCTACTGG 


AGTGTCACTG 


TCGCTGCAAC 


CTATGCACCC 


CGCACCGCTC 


CCTGGTTTGC 


25800 


AATTCGCAGC 


TGCTTAACGA 


AAGTCAAATT 


ATCGGTACCT 


TTGAGCTGCA 


GGGTCCCTCG 


25860 


CCTGACGAAA 


AGTCCGCGGC 


TCCGGGGTTG 


AAACTCACTC 


CGGGGCTGTG 


GACGTCGGCT 


25920 


TACCTTCGCA 


AATTTGTACC 


TGAGGACTAC 


CACGCCCACG 


AGATTAGGTT 


CTACGAAGAC 


25980 


CAATCCCGCC 


CGCCAAATGC 


GGAGCTTACC 


GCCTGCGTCA 


TTACCCAGGG 


CCACATTCTT 


26040 


GGCCAATTGC 


AAGCCATCAA 


CAAAGCCCGC 


CAAGAGTTTC 


TGCTACGAAA 


GGGACGGGGG 


26100 


GTTTACTTGG 


ACCCCCAGTC 


CGGCGAGGAG 


CTCAACCCAA 


TCCCCCCGCC 


GCCGCAGCCC 


26160 


TATCAGCAGC 


AGCCGCGGGC 


CCTTGCTTCC 


CAGGATGGCA 


CCCAAAAAGA 


AGCTGCAGCT 


26220 


GCCGCCGCCA 


CCCACGGACG 


AGGAGGAATA 


CTGGGACAGT 


CAGGCAGAGG 


AGGTTTTGGA 


26280 


CGAGGAGGAG 


GAGGACATGA 


TGGAAGACTG 


GGAGAGCCTA 


GACGAGGAAG 


CTTCCGAGGT 


26340 


CGAAGAGGTG 


TCAGACGAAA 


CACCGTCACC 


CTCGGTCGCA 


TTCCCCTCGC 


CGGCGCCCCA 


26400 


GAAATCGGCA 


ACCGGTTCCA 


GCATGGCTAC 


AACCTCCGCT 


CCTCAGGCGC 


CGCCGGCACT 


26460 


GCCCGTTCGC 


CGACCCAACC 


GTAGATGGGA 


CACCACTGGA 


ACCAGGGCCG 


GTAAGTCCAA 


26520 


GCAGCCGCCG 


CCGTTAGCCC 


AAGAGCAACA 


ACAGCGCCAA 


GGCTACCGCT 


CATGGCGCGG 


26580 


GCACAAGAAC 


GCCATAGTTG 


CTTGCTTGCA 


AGACTGTGGG 


GGCAACATCT 


CCTTCGCCCG 


26640 


CCGCTTTCTT 


CTCTACCATC 


ACGGCGTGGC 


CTTCCCCCGT 


AACATCCTGC 


ATTACTACCG 


26700 


TCATCTCTAC 


AGCCCATACT 


GCACCGGCGG 


CAGCGGCAGC 


GGCAGCAACA 


GCAGCGGCCA 


26760 


CACAGAAGCA 


AAGGCGACCG 


GATAGCAAGA 


CTCTGACAAA 


GCCCAAGAAA 


TCCACAGCGG 


26820 


CGGCAGCAGC 


AGGAGGAGGA 


GCGCTGCGTC 


TGGCGCCCAA 


CGAACCCGTA 


TCGACCCGCG 


26880 


AGCTTAGAAA 


CAGGATTTTT 


CCCACTCTGT 


ATGCTATATT 


TCAACAGAGC 


AGGGGCCAAG 


26940 


AACAAGAGCT 


GAAAATAAAA 


AACAGGTCTC 


TGCGATCCCT 


CACCCGCAGC 


TGCCTGTATC 


27000 


ACAAAAGCGA 


AGATCAGCTT 


CGGCGCACGC 


TGGAAGACGC 


GGAGGCTCTC 


TTCAGTAAAT 


27060 


ACTGCGCGCT 


GACTCTTAAG 


GACTAGTTTC 


GCGCCCTTTC 


TCAAATTTAA 


GCGCGAAAAC 


27120 


TACGTCATCT 


CCAGCGGCCA 


CACCCGGCGC 


CAGCACCTGT 


CGTCAGCGCC 


ATTATGAGCA 


27180 


AGGAAATTCC 


CACGCCCTAC 


ATGTGGAGTT 


ACCAGCCACA 


AATGGGACTT 


GCGGCTGGAG 


27240 


CTGCCCAAGA 


CTACTCAACC 


CGAATAAACT 


ACATGAGCGC 


GGGACCCCAC 


ATGATATCCC 


27300 


GGGTCAACGG 


AATCCGCGCC 


CACCGAAACC 


GAATTCTCTT 


GGAACAGGCG 


GCTATTACCA 


27360 


CCACACCTCG 


TAATAACCTT 


AATCCCCGTA 


GTTGGCCCGC 


TGCCCTGGTG 


TACCAGGAAA 


27420 


GTCCCGCTCC 


CACCACTGTG 


GTACTTCCCA 


GAGACGCCCA 


GGCCGAAGTT 


CAGATGACTA 


27480 


ACTCAGGGGC 


GCAGCTTGCG 


GGCGGCTTTC 


GTCACAGGGT 


GCGGTCGCCC 


GGGCAGGGTA 


27540 


TAACTCACCT 


GACAATCAGA 


GGGCGAGGTA 


TTCAGCTCAA 


CGACGAGTCG 


GTGAGCTCCT 


27600 


CGCTTGGTCT 


CCGTCCGGAC 


GGGACATTTC 


AGATCGGCGG 


CGCCGGCCGT 


CCTTCATTCA 


27660 


CGCCTCGTCA 


GGCAATCCTA 


ACTCTGCAGA 


CCTCGTCCTC 


TGAGCCGCGC 


TCTGGAGGCA 


27720 


TTGGAACTCT 


GCAATTTATT 


GAGGAGTTTG 


TGCCATCGGT 


CTACTTTAAC 


CCCTTCTCGG 


27780 
TTCCTAACTT


TGACGCGGTA 


AAGGACTCGG 


27840 


CGGACGGCTA 


CGACTGAATG 


TTAAGTGGAG 


AGGCAGAGCA 


ACTGCGCCTG 


AAACACCTGG 


27900 


TCCACTGTCG 


CCGCCACAAG 


TGCTTTGCCC 


GCGACTCCGG 


TGAGTTTTGC 


TACTTTGAAT 


27960 


TGCCCGr-.GG A 


TCATATCGAG 


GGCCCGGCGC 


ACGGCGTCCG 


GCTTACCGCC 


CAGGGAGAGC 


28020 


TTGCCCGTAG 


CCTGATTCGG 


GAGTTTACCC 


AGCGCCCCCT 


GCTAGTTGAG 


CGGGACAGGG 


28080 


GACCCTGTGT 


TCTCACTGTG 


ATTTGCAACT 


GTCCTAACCT 


TGGATTACAT 


CAAGATCTTT 


28140 


GTTGCCATCT 


CTGTGCTGAG 


TATAATAAAT 


ACAGAAATTA 


AAATATACTG 


GGGCTCCTAT 


28200 


CGCCATCCTG 


TAAACGCCAC 


CGTCTTCACC 


CGCCCAAGCA 


AACCAAGGCG 


AACCTTACCT 


28260 


GGTACTTTTA 


ACATCTCTCC 


CTCTGTGATT 


TACAACAGTT 


TCAACCCAGA 


CGGAGTGAGT 


28320 


CTACGAGAGA 


ACCTCTCCGA 


GCTCAGCTAC 


TCCATCAGAA 


AAAACACCAC 


CCTCCTTACC 


28380 


TGCCGGGAAC 


GTACGAGTGC 


GTCACCGGCC 


GCTGCACCAC 


ACCTACCGCC 


TGACCGTAAA 


28440 


CCAGACTTTT 


TCCGGACAGA 


CCTCAATAAC 


TCTGTTTACC 


AGAACAGGAG 


GTGAGCTTAG 


28500 


AAAACCCTTA 


GGGTATTAGG 


CCAAAGGCGC 


AGCTACTGTG 


GGGTTTATGA 


ACAATTCAAG 


28560 


CAACTCTACG 


GGCTATTCTA 


ATTCAGGTTT 


CTCTAGAATC 


GGGGTTGGGG 


TTATTCTCTG 


28620 


TCTTGTGATT 


CTCTTTATTC 


TTATACTAAC 


GCTTCTCTGC 


CTAAGGCTCG 


CCGCCTGCTG 


28680 


TGTGCACATT 


TGCATTTATT 


GTCAGCTTTT 


TAAACGCTGG 


GGTCGCCACC 


CAAGATGATT 


28740 


AGGTACATAA 


TCCTAGGTTT 


ACTCACCCTT 


GCGTCAGCCC 


ACGGTACCAC 


CCAAAAGGTG 


28800 


GATTTTA-^GG 


AGCCAGCCTG 


TAATGTTACA 


TTCGCAGCTG 


AAGCTAATGA 


GTGCACCACT 


28860 


CTTATAAAAT 


GCACCACAGA 


ACATGAAAAG 


CTGCTTATTC 


GCCACAAAAA 


CAAAATTGGC 


28920 


AAGTATGCTG 


TTTATGCTAT 


TTGGCAGCCA 


GGTGACACTA 


CAGAGTATAA 


TGTTACAGTT 


28980 


TTCCAGGGTA 


AAAGTCATAA 


AACTTTTATG 


TATACTTTTC 


CATTTTATGA 


AATGTGCGAC 


29040 


ATTACCATGT 


ACATGAGCAA 


ACAGTATAAG 


TTGTGGCCCC 


CACAAAATTG 


TGTGGAAAAC 


29100 


ACTGGCACTT 


TCTGCTGCAC 


TGCTATGCTA 


ATTACAGTGC 


TCGCTTTGGT 


CTGTACCCTA 


29160 


CTCTATATTA 


AATACAAAAG 


CAGACGCAGC 


TTTATTGAGG 


AAAAGAAAAT 


GCCTTAATTT 


29220 


ACTAAGTTAC 


AAAGCTAATG 


TCACCACTAA 


CTGCTTTACT 


CGCTGCTTGC 


AAAACAAATT 


29280 


CAAAAAGTTA 


GCATTATAAT 


TAGAATAGGA 


TTTAAACCCC 


CCGGTCATTT 


CCTGCTCAAT 


29340 


ACCATTCCCC 


TGAACAATTG 


ACTCTATGTG 


GGATATGCTC 


CAGCGCTACA 


ACCTTGAAGT 


29400 


CAGGCTTCCT 


GGATGTCAGC 


ATCTGACTTT 


GGCCAGCACC 


TGTCCCGCGG 


ATTTGTTCCA 


29460 


GTCCAACTAC 


AGCGACCCAC 


CCTAACAGAG 


ATGACCAACA 


CAACCAACGC 


GGCCGCCGCT 


29520 


ACCGGACTTA 


CATCTACCAC 


AAATACACCC 


CAAGTTTCTG 


CCTTTGTCAA 


TAACTGGGAT 


29580 


AACTTGGGCA 


TGTGGTGGTT 


CTCCATAGCG 


CTTATGTTTG 


TATGCCTTAT 


TATTATGTGG 


29640 


CTCATCTGCT 


GCCTAAAGCG 


CAAACGCGCC 


CGACCACCCA 


TCTATAGTCC 


CATCATTGTG 


29700 


CTACACCCAA 


ACT^TGATGG 


AATCCATAGA 


TTGGACGGAC 


TGAAACACAT 


GTTCTTTTCT 


29760 


CTTACAGTAT 


GATTAAATGA 


GACATGATTC 


CTCGAGTTTT 


TATATTACTG 


ACCCTTGTTG 


29820 


CGCTTTTTTG 


TGCGTGCTCC 


ACATTGGCTG 


CGGTTTCTCA 


CATCGAAGTA 


GACTGCATTC 


29880 
CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 29940

TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 30000

TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 30060

TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 30120

GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 30180

TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 30240

GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 30300

ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA 30360

AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC 30420

TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 30480

ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC 30540

GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT 30600

GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 30660

ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA 30720

TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 30780

ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 30840

AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA 30900

TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 30960

AACTTTCTCC ACAATCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 31020

ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 31080

GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 31140

GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 31200

CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC 31260

GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC 31320

AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT 31380

GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC 31440

CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA 31500

GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 31560

ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 31620

GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 31680

ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 31740

TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 31800

AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 31860

TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA 31920 

AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 31980 
AACAATTCCA


AAAAGCTTGA 


GGTTAACCTA 


AGCACTGCCA 


AGGGGTTGAT 


GTTTGACGCT 


32040 


ACAGCCATAG 


CCATTAATGC 


AGGAGATGGG 


CTTGAATTTG 


GTTCACCTAA 


TGCACCAAAC 


32100 


ACAAATCCCC 


TCAAAACAAA 


AATTGGCCAT 


GGCCTAGAAT 


TTGATTCAAA 


CAAGGCTATG 


32160 


GTTCCTAAAC 


TAGGAACTGG 


CCTTAGTTTT 


GACAGCACAG 


GTGCCATTAC 


AGTAGGAAAC 


32220 


AAAAATAATG 


ATAAGCTAAC 


TTTGTGGACC 


ACACCAGCTC 


CATCTCCTAA 


CTGTAGACTA 


32280 


AATGCAGAGA 


AAGATGCTAA 


ACTCACTTTG 


GTCTTAACAA 


AATGTGGCAG 


TCAAATACTT 


32340 


GCTACAGTTT 


CAGTTTTGGC 


TGTTAAAGGC 


AGTTTGGCTC 


CAATATCTGG 


AACAGTTCAA 


32400 


AGTGCTCATC 


TTATTATAAG 


ATTTGACGAA 


AATGGAGTGC 


TACTAAACAA 


TTCCTTCCTG 


32460 


GACCCAGAAT 


ATTGGAACTT 


TAGAAATGGA 


GATCTTACTG 


AAGGCACAGC 


CTATACAAAC 


32520 


GCTGTTGGAT 


TTATGCCTAA 


CCTATCAGCT 


TATCCAAAAT 


CTCACGGTAA 


AACTGCCAAA 


32580 


AGTAACATTG 


TCAGTCAAGT 


TTACTTAAAC 


GGAGACAAAA 


CTAAACCTGT 


AACACTAACC 


32640 


ATTACACTAA 


ACGGTACACA 


GGAAACAGGA 


GACACAACTC 


CAAGTGCATA 


CTCTATGTCA 


32700 


TT7TCATGGG 


ACTGGTCTGG 


CCACAACTAC 


ATTAATGAAA 


TATTTGCCAC 


ATCCTCTTAC 


32760 


ACTTTTTCAT 


ACATTGCCCA 


AGAATAAAGA 


ATCGTTTGTG 


TTATGTTTCA 


ACGTGTTTAT 


32820 


TTTTCAATTG 


CAGAAAATTT 


CAAGTCATTT 


TTCATTCAGT 


AGTATAGCCC 


CACCACCACA 


32880 


TAGCTTATAC 


AGATCACCGT 


ACCTTAATCA 


AACTCACAGA 


ACCCTAGTAT 


TCAACCTGCC 


32940 


ACCTCCCTCC 


CAACACACAG 


AGTACACAGT 


CCTTTCTCCC 


CGGCTGGCCT 


TAAAAAGCAT 


33000 


CATATCATGG 


GTAACAGACA 


TATTCTTAGG 


TGTTATATTC 


CACACGGTTT 


CCTGTCGAGC 


33060 


CAAACGCTCA 


TCAGTGATAT 


TAATAAACTC 


CCCGGGCAGC 


TCACTTAAGT 


TCATGTCGCT 


33120 


GTCCAGCTGC 


TGAGCCACAG 


GCTGCTGTCC 


AACTTGCGGT 


TGCTTAACGG 


GCGGCGAAGG 


33180 


AGAAGTCCAC 


GCCTACATGG 


GGGTAGAGTC 


ATAATCGTGC 


ATCAGGATAG 


GGCGGTGGTG 


33240 


CTGCAGCAGC 


GCGCGAATAA 


ACTGCTGCCG 


CCGCCGCTCC 


GTCCTGCAGG 


AATACAACAT 


33300 


GGCAGTGGTC 


TCCTCAGCGA 


TGATTCGCAC 


CGCCCGCAGC 


ATAAGGCGCC 


TTGTCCTCCG 


33360 


GGCACAGCAG 


CGCACCCTGA 


TCTCACTTAA 


ATCAGCACAG 


TAACTGCAGC 


ACAGCACCAC 


33420 


AATATTGTTC 


AAAATCCCAC 


AGTGCAAGGC 


GCTGTATCCA 


AAGCTCATGG 


CGGGGACCAC 


33480 


AGAACCCACG 


TGGCCATCAT 


ACCACAAGCG 


CAGGTAGATT 


AAGTGGCGAC 


CCCTCATAAA 


33540 


CACGCTGGAC 


ATAAACATTA 


CCTCTTTTGG 


CATGTTGTAA 


TTCACCACCT 


CCCGGTACCA 


33S00 


TATAAACCTC 


TGATTAAACA 


TGGCGCCATC 


CACCACCATC 


CTAAACCAGC 


TGGCCAAAAC 


33660 


CTGCCCGCCG 


GCTATACACT 


GCAGGGAACC 


GGGACTGGAA 


CAATGACAGT 


GGAGAGCCCA 


33720 


GGACTCGTAA 


CCATGGATCA 


TCATGCTCGT 


CATGATATCA 


ATGTTGGCAC 


AACACAGGCA 


33780 


CACGTGCATA 


CACTTCCTCA 


GGATTACAAG 


CTCCTCCCGC 


GTTAGAACCA 


TATCCCAGGG 


33840 


AACAACCCAT 


TCCTGAATCA 


GCGTAAATCC 


CACACTGCAG 


GGAAGACCTC 


GCACGTAACT 


33900 


CACGTTGTGC 


ATTGTCAAAG 


TGTTACATTC 


GGGCAGCAGC 


GGATGATCCT 


CCAGTATGGT 


33960 


AGCGCGGGTT 


TCTGTCTCAA 


AAGGAGGTAG 


ACGATCCCTA 


CTGTACGGAG 


TGCGCCGAGA 


34020 


CAACCGAGAT 


CGTGTTGGTC 


GTAGTGTCAT 


GCCAAATGGA 


ACGCCGGACG 


TAGTCATATT 


34080 
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TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT 
TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT 
TGGCTTCGGG TTCTATG7AA ACTCCTTCAT GCGCCGCTGC 
CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG 
CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA 
AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG 
AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG 
CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT 
CCAGCACCTT CAACCATGCC CAAATAATTC TCATCTCGCC 
AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT 
TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG 
GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG 
GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC 
TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG 
CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT 
TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT 
CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT 
GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT 
TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC 
CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA 
GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT 
CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA 
ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT 
TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT 
CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA 
GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT 
ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC 
CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT 
CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT 
CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC 
ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT 
(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
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GCGTCTCCGG TCTCGCCGCT 34140 

CAAAGCATCC AGGCGCCCCC 34 2 00 

CCTGATAACA TCCACCACCG 3 4260 

CGAGTCACAC ACGGGAGGAG 34 320 

AAAGATTATC CAAAACCTCA 34 380 

CGTGGTCAAA CTCTACAGCC 34440 

CTTCCAAAAG GCAAACGGCC 3 4 500 

GAATCTCCTC TATAAACATT 3 4 560 

ACCTTCTCAA TATATCTCTA 34620 

GCTCCAGAGC GCCCTCCACC 3 4680 

TTCCTCACAG ACCTGTATAA 34 740 

TAGGTCCCTT CGCAGGGCCA 34800 

CACTTCCCCG CCAGGAACCT 34860 

AGCTATGCTA ACCAGCGTAG 34 92 0 

GCAAGGTGCT GCTCAAAAAA 3 4 980 

CATGCTCATG CAGATAAAGG 3 5040 

TTCTCTCAAA CATGTCTGCG 3 5100 

TTAAACATTA GAAGCCTGTC 3 516 0 

TACGGCCATG CGGGCGTGAC 3 5220 

CAGCTCCTCG GTCATGTCCG 3 5280 

CATCGGTCAG TGCTAAAAAG 3 5340 

GAGACAACAT TACAGCCCCC 3 5400 

AAACACCTGA AAAACCCTCC 3 5460 

ACAGCGCTTC ACAGCGGCAG 3 552 0 

AAAAAACACC ACTCGACACG 3 5580 

GCAGAGCGAG TATATATAGG 3 5640 

CCAGAAAACC GCACGCGAAC 357 00 

CAAATCGTCA CTTCCGTTTT 3 5760 

CCCAACACAT ACAAGTTACT 3 5820 

CGCGCCACGT CACAAACTCC 3 58 80 

ATATTATTGA TGATG 3 5 935 
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(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

TTCATTTTAT GTTTCAGGTT CAGGG 2 5 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TTACCGCCAC ACTCGCAGGG 20

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34303 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TTATTTTGGA TTGAAGCCAA TATGATAATG AGGGGGTGGA GTTTGTGACG TGGCGCGGGG 60

CGTGGGAACG GGGCGGGTGA CGTAGTAGTG TGGCGGAAGT GTGATGTTGC AAGTGTGGCG 120

GAACACATGT AAGCGACGGA TGTGGCAAAA GTGACGTTTT TGGTGTGCGC CGGATCCACA 180

GGACGGGTGT GGTCGCCATG ATCGCGTAGT CGATAGTGGC TCCAAGTAGC GAAGCGAGCA 240

GGACTGGGCG GCGGCCAAAG CGGTCGGACA GTGCTCCGAG AACGGGTGCG CATAGAAATT 300

GCATCAACGC ATATAGCGCT AGCAGCACGC CATAGTGACT GGCGATGCTG TCGGAATGGA 360

CGATATCCCG CAAGAGGCCC GGCAGTACCG GCATAACCAA GCCTATGCCT ACAGCATCCA 420

GGGTGACGGT GCCGAGGATG ACGATGAGCG CATTGTTAGA TTTCATACAC GGTGCCTGAC 480

TGCGTTAGCA ATTTAACTGT GATAAACTAC CGCATTAAAG CTTATCGATG ATAAGCTGTC 540

AAACATGAGA ATTCTTGAAG ACGAAAGGGC CTCGTGATAC GCCTATTTTT ATAGGTTAAT 600

GTCATGATAA TAATGGTTTC TTAGACGTCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA 660

ACCCCTATTT GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA 720

CCCTGATAAA TGCTTCAATA ATATTGAAAA AGGAAGAGTA TGAGTATTCA ACATTTCCGT 780

GTCGCCCTTA TTCCCTTTTT TGCGGCATTT TGCCTTCCTG TTTTTGCTCA CCCAGAAACG 840

CTGGTGAAAG TAAAAGATGC TGAAGATCAG TTGGGTGCAC GAGTGGGTTA CATCGAACTG 900

GATCTCAACA GCGGTAAGAT CCTTGAGAGT TTTCGCCCCG AAGAACGTTT TCCAATGATG 960

AGCACTTTTA AAGTTCTGCT ATGTGGCGCG GTATTATCCC GTGTTGACGC CGGGCAAGAG 1020

CAACTCGGTC GCCGCATACA CTATTCTCAG AATGACTTGG TTGAGTACTC ACCAGTCACA 1080

GAAAAGCATC TTACGGATGG CATGACAGTA AGAGAATTAT GCAGTGCTGC CATAACCATG 1140

AGTGATAACA CTGCGGCCAA CTTACTTCTG ACAACGATCG GAGGACCGAA GGAGCTAACC 1200

GCTTTTTTGC ACAACATGGG GGATCATGTA ACTCGCCTTG ATCGTTGGGA ACCGGAGCTG 1260

AATGAAGCCA TACCAAACGA CGAGCGTGAC ACCACGATGC CTGCAGCAAT GGCAACAACG 1320

TTGCGCAAAC TATTAACTGG CGAACTACTT ACTCTAGCTT CCCGGCAACA ATTAATAGAC 1380

TGGATGGAGG CGGATAAAGT TGCAGGACCA CTTCTGCGCT CGGCCCTTCC GGCTGGCTGG 1440

TTTATTGCTG ATAAATCTGG AGCCGGTGAG CGTGGGTCTC GCGGTATCAT TGCAGCACTG 1500

GGGCCAGATG GTAAGCCCTC CCGTATCGTA GTTATCTACA CGACGGGGAG TCAGGCAACT 1560

ATGGATGAAC GAAATAGACA GATCGCTGAG ATAGGTGCCT CACTGATTAA GCATTGGTAA 1620

CTGTCAGACC AAGTTTACTC ATATATACTT TAGATTGATT TAAAACTTCA TTTTTAATTT 1680

AAAAGGATCT AGGTGAAGAT CCTTTTTGAT AATCTCATGA CCAAAATCCC TTAACGTGAG 1740

TTTTCGTTCC ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC TTGAGATCCT 1800

TTTTTTCTGC GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT 1860

TGTTTGCCGG ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG 1920

CAGATACCAA ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT 1980

GTAGCACCGC CTACATACCT CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC 2040

GATAAGTCGT GTCTTACCGG GTTGGACTCA AGACGATAGT TACCGGATAA GGCGCAGCGG 2100

TCGGGCTGAA CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA 2160

CTGAGATACC TACAGCGTGA GCATTGAGAA AGCGCCACGC TTCCCGAAGG GAGAAAGGCG 2220

GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG 2280

GGAAACGCCT GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA 2340

TTTTTGTGAT GCTCGTCAGG GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT 2400

TTACGGTTCC TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT 2460

GATTCTGTGG ATAACCGTAT TACCGCCTTT GAGTGAGCTG ATACCGCTCG CCGCAGCCGA 2520

ACGACCGAGC GCAGCGAGTC AGTGAGCGAG GAAGCGGAAG AGCGCCTGAT GCGGTATTTT 2580

CTCCTTACGC ATCTGTGCGG TATTTCACAC CGCATATGGT GCACTCTCAG TACAATCTGC 2640

TCTGATGCCG CATAGTTAAG CCAGTATACA CTCCGCTATC GCTACGTGAC TGGGTCATGG 2700

CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG 2760

CATCCGCTTA CAGACAAGCT GTGACCGTCT CCGGGAGCTG CATGTGTCAG AGGTTTTCAC 2820

CGTCATCACC GAAACGCGCG AGGCAGTCTA GACAATAGTA GTACGGATAG CTGTGACTCC 2880

GGTCCTTCTA ACACACCTCC TGAGATACAC CCGGTGGTCC CGCTGTGCCC CATTAAACCA 2940

GTTGCCGTGA GAGTTGGTGG GCGTCGCCAG GCTGTGGAAT GTATCGAGGA CTTGCTTAAC 3000

GAGCCTGGGC AACCTTTGGA CTTGAGCTGT AAACGCCCCA GGCCATAAGG TGTAAACCTG 3060

TGATTGCGTG TGTGGTTAAC GCCTTTGTTT GCTGAATGAG TTGATGTAAG TTTAATAAAG 3120

GGTGAGATAA TGTTTAACTT GCATGGCGTG TTAAATGGGG CGGGGCTTAA AGGGTATATA 3180

ATGCGCCGTG GGCTAATCTT GGTTACATCT GACCTCATGG AGGCTTGGGA GTGTTTGGAA 3240

GATTTTTCTG CTGTGCGTAA CTTGCTGGAA CAGAGCTCTA ACAGTACCTC TTGGTTTTGG 3300

AGGTTTCTGT GGGGCTCATC CCAGGCAAAG TTAGTCTGCA GAATTAAGGA GGATTACAAG 3360

TGGGAATTTG AAGAGCTTTT GAAATCCTGT GGTGAGCTGT TTGATTCTTT GAATCTGGGT 3420

CACCAGGCGC TTTTCCAAGA GAAGGTCATC AAGACTTTGG ATTTTTCCAC ACCGGGGCGC 3480

GCTGCGGCTG CTGTTGCTTT TTTGAGTTTT ATAAAGGATA AATGGAGCGA AGAAACCCAT 3540

CTGAGCGGGG GGTACCTGCT GGATTTTCTG GCCATGCATC TGTGGAGAGC GGTTGTGAGA 3600

CACAAGAATC GCCTGCTACT GTTGTCTTCC GTCCGCCCGG CGATAATACC GACGGAGGAG 3660

CAGCAGCAGC AGCAGGAGGA AGCCAGGCGG CGGCGGCAGG AGCAGAGCCC ATGGAACCCG 3720

AGAGCCGGCC TGGACCCTCG GGAATGAATG TTGTACAGGT GGCTGAACTG TATCCAGAAC 3780

TGAGACGCAT TTTGACAATT ACAGAGGATG GGCAGGGGCT AAAGGGGGTA AAGAGGGAGC 3840

GGGGGGCTTG TGAGGCTACA GAGGAGGCTA GGAATCTAGC TTTTAGCTTA ATGACCAGAC 3900

ACCGTCCTGA GTGTATTACT TTTCAACAGA TCAAGGATAA TTGCGCTAAT GAGCTTGATC 3960

TGCTGGCGCA GAAGTATTCC ATAGAGCAGC TGACCACTTA CTGGCTGCAG CCAGGGGATG 4020

ATTTTGAGGA GGCTATTAGG GTATATGCAA AGGTGGCACT TAGGCCAGAT TGCAAGTACA 4080

AGATCAGCAA ACTTGTAAAT ATCAGGAATT GTTGCTACAT TTCTGGGAAC GGGGCCGAGG 4140

TGGAGATAGA TACGGAGGAT AGGGTGGCCT TTAGATGTAG CATGATAAAT ATGTGGCCGG 4200

GGGTGCTTGG CATGGACGGG GTGGTTATTA TGAATGTAAG GTTTACTGGC CCCAATTTTA 4260

GCGGTACGGT TTTCCTGGCC AATACCAACC TTATCCTACA CGGTGTAAGC TTCTATGGGT 4320

TTAACAATAC CTGTGTGGAA GCCTGGACCG ATGTAAGGGT TCGGGGCTGT GCCTTTTACT 4380

GCTGCTGGAA GGGGGTGGTG TGTCGCCCCA AAAGCAGGGC TTCAATTAAG AAATGCCTCT 4440

TTGAAAGGTG TACCTTGGGT ATCCTGTCTG AGGGTAACTC CAGGGTGCGC CACAATGTGG 4500

CCTCCGACTG TGGTTGCTTC ATGCTAGTGA AAAGCGTGGC TGTGATTAAG CATAACATGG 4560

TATGTGGCAA CTGCGAGGAC AGGGCCTCTC AGATGCTGAC CTGCTCGGAC GGCAACTGTC 4620

ACCTGCTGAA GACCATTCAC GTAGCCAGCC ACTCTCGCAA GGCCTGGCCA GTGTTTGAGC 4680

ATAACATACT GACCCGCTGT TCCTTGCATT TGGGTAACAG GAGGGGGGTG TTCCTACCTT 4740

ACCAATGCAA TTTGAGTCAC ACTAAGATAT TGCTTGAGCC CGAGAGCATG TCCAAGGTGA 4800

ACCTGAACGG GGTGTTTGAC ATGACCATGA AGATCTGGAA GGTGCTGAGG TACGATGAGA 4860

CCCGCACCAG GTGCAGACCC TGCGAGTGTG GCGGTAAACA TATTAGGAAC CAGCCTGTGA 4920

TGCTGGATGT GACCGAGGAG CTGAGGCCCG ATCACTTGGT GCTGGCCTGC ACCCGCGCTG 4980

AGTTTGGCTC TAGCGATGAA GATACAGATT GAGGTACTGA AATGTGTGGG CGTGGCTTAA 5040

GGGTGGGAAA GAATATATAA GGTGGGGGTC TTATGTAGTT TTGTATCTGT TTTGCAGCAG 5100

CCGCCGCCGC CATGAGCACC AACTCGTTTG ATGGAAGCAT TGTGAGCTCA TATTTGACAA 5160

CGCGCATGCC CCCATGGGCC GGGGTGCGTC AGAATGTGAT GGGCTCCAGC ATTGATGGTC 5220

GCCCCGTCCT GCCCGCAAAC TCTACTACCT TGACCTACGA GACCGTGTCT GGAACGCCGT 5280

TGGAGACTGC AGCCTCCGCC GCCGCTTCAG CCGCTGCAGC CACCGCCCGC GGGATTGTGA 5340

CTGACTTTGC TTTCCTGAGC CCGCTTGCAA GCAGTGCAGC TTCCCGTTCA TCCGCCCGCG 5400

ATGACAAGTT GACGGCTCTT TTGGCACAAT TGGATTCTTT GACCCGGGAA CTTAATGTCG 5460

TTTCTCAGCA GCTGTTGGAT CTGCGCCAGC AGGTTTCTGC CCTGAAGGCT TCCTCCCCTC 5520

CCAATGCGGT TTAAAACATA AATAAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG 5580

TGTCTTGCTG TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC 5640

GGTCGTTGAG GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA 5700

GATACATGGG CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT 5760

GCGGGGTGGT GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA 5820

TGTCTTTCAG TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC 5880

GGTTAAGCTG GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA 5940

GGTTGGCTAT GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA 6000

CAGTGTATCC GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA 6060

ACTTGGAGAC GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA 6120

TGGGCCCACG GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT 6180

GTTCCAGGAT GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT 6240

GCGGTATAAT GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC 6300

ACGCTTTGAG TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACGGTTT 6360

CCGGGGTAGG GGAGATCAGC TGGGAAGAAA GCAGGTTCCT GAGCAGCTGC GACTTACCGC 6420

AGCCGGTGGG CCCGTAAATC ACACCTATTA CCGGGTGCAA CTGGTAGTTA AGAGAGCTGC 6480

AGCTGCCGTC ATCCCTGAGC AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTCGCATGT 6540

TTTCCCTGAC CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG 6600

AAGCAAAGTT TTTCAACGGT TTGAGACCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC 6660

CAAGCAGTTC CAGGCGGTCC CACAGCTCGG TCACCTGCTC TACGGCATCT CGATCCAGCA 6720

TATCTCCTCG TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC 6780

CAGACGGGCC AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT 6840

CACGGTGAAG GGGTGCGCTC CGGGCTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT 6900

GCTGGTGCTG AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT 6960

GGTGTCATAG TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA 7020

GGCGCCGCAC GAGGGGCAGT GCAGACTTTT GAGGGCGTAG AGCTTGGGCG CGAGAAATAC 7080

CGATTCCGGG GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG 7140

CCAGGTGAGC TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG 7200

TTTCTTACCT CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT 7260

GTCCCCGTAT ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA 7320

TAGAAACTCG GACCACTCTG AGACAAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA 7380

GTGGGAGGGG TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA 7440

CATGTCGCCC TCTTCGGCAT CAAGGAAGGT GATTGGTTTG TAGGTGTAGG CCACGTGACC 7500

GGGTGTTCCT GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC 7560

CGCATCGCTG TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTGAA AAGCGGGCAT 7620

GACTTCTGCG CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC 7680

CGCGGTGATG CCTTTGAGGG TGGCCGCATC CATCTGGTCA GAAAAGACAA TCTTTTTGTT 7740

GTCAAGCTTG GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG 7800

CAGGGTTTGG TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA 7860

TTCGCGCGCA ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACCAGGTG 7920

CACGCGCCAA CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG 7980

TAGGCGCTCG TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGGGG 8040

GTCTAGCTGC GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG 8100

CGCGTCGAAG TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC 8160

GGCAAGCGCG CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC 8220

GGAGGCGTAC ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA 8280

TGTAGGGTAG CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA 8340

GGGAGCGAGG AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT 8400

CTGCCTGAAG ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT 8460

GGCGTCTGTG AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT 8520

GACCAGCTCG GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT 8580

GTCATACTTA TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG 8640

GTCTTTCCAG TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT 8700

GTAGAACTGG TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC 8760

CTGCGCGGCC TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTGA CCATGACTTT 8820

GAGGTACTGG TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC 8880

CGTGCGCTTT TTGGAACGCG GATTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT 8940

TCCCGCGCGA GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT 9000

GTTAATTACC TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT 9060

GTAAAGTTCC AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA 9120

GGTGAGCTCT TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG 9180

GTTGGAAGCG ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG 9240

AAAGGTCCTA AACTGGCGAC CTATGGCCAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG 9300

CGGGTCTTGT TCCCAGCGGT CCCATCCAAG GTTCGCGGCT AGGTCTCGCG CGGCAGTCAC 9360

TAGAGGCTCA TCTCCGCCGA ACTTCATGAC CAGCATGAAG GGCACGAGCT GCTTCCCAAA 9420

GGCCCCCATC CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG 9480

ATGCGAGCCG ATCGGGAAGA ACTGGATCTC CCGCCACCAA TTGGAGGAGT GGCTATTGAT 9540

GTGGTGAAAG TAGAAGTCCC TGCGACGGGC CGAACACTCG TGCTGGCTTT TGTAAAAACG 9600

TGCGCAGTAC TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC 9660

GCGCACAAGG AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC 9720

TTCTACTTCG GCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG 9780

GACCACCACG CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT 9840

GACAACATCG CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGCG TCAGGTCAGG 9900

CGGGAGCTCC TGCAGGTTTA CCTCGCATAG ACGGGTCAGG GCGCGGGCTA GATCCAGGTG 9960

ATACGTAATT TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC 10020

CCGCGGCGCG ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA 10080

TGCATCTAAA AGCGGTGACG CGGGCGAGCC CCCGGAGGTA GGGGGGGCTC CGGACCCGCC 10140

GGGAGAGGGG GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGT 10200

AGGTTGCTGG CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG 10260

AAGACGACGG GCCCGGTGAG CTTGAGCCTG AAAGAGAGTT CGACAGAATC AATTTCGGTG 10320

TCGTTGACGG CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG 10380

ATCTCGGCCA TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC 10440

ACGGTGGCGG CGAGGTCGTT GGAAATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT 10500

CCCTCGTTCC AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC 10560

ACCTGCGCGA GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA 10620

AAGAGGTAGT TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGT 10680

CGCAACGTGG ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG 10740

TCCACGGCGA AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA 10800

AGACGGATGA GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT 10860

TCTTCTTCTT CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT 10920

GGGGGAGGGG GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG 10980

ATCATCTCCC CGCGGCGACG GCGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG 11040

CGCAGTTGGA AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCATGC 11100

GGCAGGGATA CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCGCCG 11160

AGGGACCTGA GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC 11220

CAGTCACAGT CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG 11280

TTGTTTCTGG CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG 11340

ATGGTCGACA GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTCGGCC 11400

ATGCCCCAGG CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT 11460

TCTACCGGCA CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG 11520

GCGGCGGCGG AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG 11580

CCCCTCATCG GCTGAAGCAG GGCTAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC 11640

TGCACCTGCG TGAGGGTAGA CTGGAAGTCA TCCATGTCCA CAAAGCGGTG GTATGCGCCC 11700

GTGTTGATGG TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTGACCCGGC 11760

TGCGAGAGCT CGGTGTACCT GAGACGCGAG TAAGCCCTCG AGTCAAATAC GTAGTCGTTG 11820

CAAGTCCGCA CCAGGTACTG GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG 11880

GGCCAGCGTA GGGTGGCCGG GGCTCCGGGG GCGAGATCTT CCAACATAAG GCGATGATAT 11940

CCGTAGATGT ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG 12000

TCGCGGACGC GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC 12060

TGGCCGGTCA GGCGCGCGCA ATCGTTGACG CTCTACCGTG CAAAAGGAGA GCCTGTAAGC 12120

GGGCACTCTT CCGTGGTCTG GTGGATAAAT TCGCAAGGGT ATCATGGCGG ACGACCGGGG 12180

TTCGAGCCCC GTATCCGGCC GTCCGCCGTG ATCCATGCGG TTACCGCCCG CGTGTCGAAC 12240

CCAGGTGTGC GACGTCAGAC AACGGGGGAG TGCTCCTTTT GGCTTCCTTC CAGGCGCGGC 12300

GGCTGCTGCG CTAGCTTTTT TGGCCACTGG CCGCGCGCAG CGTAAGCGGT TAGGCTGGAA 12360

AGCGAAAGCA TTAAGTGGCT CGCTCCCTGT AGCCGGAGGG TTATTTTCCA AGGGTTGAGT 12420

CGCGGGACCC CCGGTTCGAG TCTCGGACCG GCCGGACTGC GGCGAACGGG GGTTTGCCTC 12480

CCCGTCATGC AAGACCCCGC TTGCAAATTC CTCCGGAAAC AGGGACGAGC CCCTTTTTTG 12540

CTTTTCCCAG ATGCATCCGG TGCTGCGGCA GATGCGCCCC CCTCCTCAGC AGCGGCAAGA 12600

GCAAGAGCAG CGGCAGACAT GCAGGGCACC CTCCCCTCCT CCTACCGCGT CAGGAGGGGC 12660

GACATCCGCG GTTGACGCGG CAGCAGATGG TGATTACGAA CCCCCGCGGC GCCGGGCCCG 12720

GCACTACCTG GACTTGGAGG AGGGCGAGGG CCTGGCGCGG CTAGGAGCGC CCTCTCCTGA 12780

GCGGTACCCA AGGGTGCAGC TGAAGCGTGA TACGCGTGAG GCGTACGTGC CGCGGCAGAA 12840

CCTGTTTCGC GACCGCGAGG GAGAGGAGCC CGAGGAGATG CGGGATCGAA AGTTCCACGC 12900

AGGGCGCGAG CTGCGGCATG GCCTGAATCG CGAGCGGTTG CTGCGCGAGG AGGACTTTGA 12960

GCCCGACGCG CGAACCGGGA TTAGTCCCGC GCGCGCACAC GTGGCGGCCG CCGACCTGGT 13020

AACCGCATAC GAGCAGACGG TGAACCAGGA GATTAACTTT CAAAAAAGCT TTAACAACCA 13080

CGTGCGTACG CTTGTGGCGC GCGAGGAGGT GGCTATAGGA CTGATGCATC TGTGGGACTT 13140

TGTAAGCGCG CTGGAGCAAA ACCCAAATAG CAAGCCGCTC ATGGCGCAGC TGTTCCTTAT 13200

AGTGCAGCAC AGCAGGGACA ACGAGGCATT CAGGGATGCG CTGCTAAACA TAGTAGAGCC 13260

CGAGGGCCGC TGGCTGCTCG ATTTGATAAA CATCCTGCAG AGCATAGTGG TGCAGGAGCG 13320

CAGCTTGAGC CTGGCTGACA AGGTGGCCGC CATCAACTAT TCCATGCTTA GCCTGGGCAA 13380

GTTTTACGCC CGCAAGATAT ACCATACCCC TTACGTTCCC ATAGACAAGG AGGTAAAGAT 13440

CGAGGGGTTC TACATGCGCA TGGCGCTGAA GGTGCTTACC TTGAGCGACG ACCTGGGCGT 13500

TTATCGCAAC GAGCGCATCC ACAAGGCCGT GAGCGTGAGC CGGCGGCGCG AGCTCAGCGA 13560

CCGCGAGCTG ATGCACAGCC TGCAAAGGGC CCTGGCTGGC ACGGGCAGCG GCGATAGAGA 13620

GGCCGAGTCC TACTTTGACG CGGGCGCTGA CCTGCGCTGG GCCCCAAGCC GACGCGCCCT 13680

GGAGGCAGCT GGGGCCGGAC CTGGGCTGGC GGTGGCACCC GCGCGCGCTG GCAACGTCGG 13740

CGGCGTGGAG GAATATGACG AGGACGATGA GTACGAGCCA GAGGACGGCG AGTACTAAGC 13800

GGTGATGTTT CTGATCAGAT GATGCAAGAC GCAACGGACC CGGCGGTGCG GGCGGCGCTG 13860

CAGAGCCAGC CGTCCGGCCT TAACTCCACG GACGACTGGC GCCAGGTCAT GGACCGCATC 13920

ATGTCGCTGA CTGCGCGCAA TCCTGACGCG TTCCGGCAGC AGCCGCAGGC CAACCGGCTC 13980

TCCGCAATTC TGGAAGCGGT GGTCCCGGCG CGCGCAAACC CCACGCACGA GAAGGTGCTG 14040

GCGATCGTAA ACGCGCTGGC CGAAAACAGG GCCATCCGGC CCGACGAGGC CGGCCTGGTC 14100

TACGACGCGC TGCTTCAGCG CGTGGCTCGT TACAACAGCG GCAACGTGCA GACCAACCTG 14160

GACCGGCTGG TGGGGGATGT GCGCGAGGCC GTGGCGCAGC GTGAGCGCGC GCAGCAGCAG 14220

GGCAACCTGG GCTCCATGGT TGCACTAAAC GCCTTCCTGA GTACACAGCC CGCCAACGTG 14280

CCGCGGGGAC AGGAGGACTA CACCAACTTT GTGAGCGCAC TGCGGCTAAT GGTGACTGAG 14340

ACACCGCAAA GTGAGGTGTA CCAGTCTGGG CCAGACTATT TTTTCCAGAC CAGTAGACAA 14400

GGCCTGCAGA CCGTAAACCT GAGCCAGGCT TTCAAAAACT TGCAGGGGCT GTGGGGGGTG 14460

CGGGCTCCCA CAGGCGACCG CGCGACCGTG TCTAGCTTGC TGACGCCCAA CTCGCGCCTG 14520

TTGCTGCTGC TAATAGCGCC CTTCACGGAC AGTGGCAGCG TGTCCCGGGA CACATACCTA 14580

GGTCACTTGC TGACACTGTA GCGCGAGGCC ATAGGTCAGG CGCATGTGGA CGAGCATACT 14640

TTCCAGGAGA TTACAAGTGT CAGCCGCGCG CTGGGGCAGG AGGACACGGG CAGCCTGGAG 14700

GCAACCCTAA ACTACCTGCT GACCAACCGG CGGCAGAAGA TCCCCTCGTT GCACAGTTTA 14760

AACAGCGAGG AGGAGCGCAT TTTGCGCTAC GTGCAGCAGA GCGTGAGCCT TAACCTGATG 14820

CGCGACGGGG TAACGCCCAG CGTGGCGCTG GACATGACCG CGCGCAACAT GGAACCGGGC 14880

ATGTATGCCT CAAACCGGCC GTTTATCAAC CGCCTAATGG ACTACTTGCA TCGCGCGGCC 14940

GCCGTGAACC CCGAGTATTT CACCAATGCC ATCTTGAACC CGCACTGGCT ACCGCCCCCT 15000

GGTTTCTACA CCGGGGGATT CGAGGTGCCC GAGGGTAACG ATGGATTCCT CTGGGACGAC 15060

ATAGACGACA GCGTGTTTTC CCCGCAACCG CAGACCCTGC TAGAGTTGCA ACAGCGCGAG 15120

CAGGCAGAGG CGGCGCTGCG AAAGGAAAGC TTCCGCAGGC CAAGCAGCTT GTCCGATCTA 15180

GGCGCTGCGG CCCCGCGGTC AGATGCTAGT AGCCCATTTC CAAGCTTGAT AGGGTCTCTT 15240

ACCAGCACTC GCACCACCCG CCCGCGCCTG CTGGGCGAGG AGGAGTACCT AAACAACTCG 15300

CTGCTGCAGC CGCAGCGCGA AAAAAACCTG CCTCCGGCAT TTCCCAACAA CGGGATAGAG 15360

AGCCTAGTGG ACAAGATGAG TAGATGGAAG ACGTACGCGC AGGAGCACAG GGACGTGCCA 15420

GGCCCGCGCC CGCCCACCCG TCGTCAAAGG CACGACCGTC AGCGGGGTCT GGTGTGGGAG 15480

GACGATGACT CGGCAGACGA CAGCAGCGTC CTGGATTTGG GAGGGAGTGG CAACCCGTTT 15540

GCGCACCTTC GCCCCAGGCT GGGGAGAATG TTTTAAAAAA AAAAAAGCAT GATGCAAAAT 15600

AAAAAACTCA CCAAGGCCAT GGCACCGAGC GTTGGTTTTC TTGTATTCCC CTTAGTATGC 15660

GGCGCGCGGC GATGTATGAG GAAGGTCCTC CTCCCTCCTA CGAGAGTGTG GTGAGCGCGG 15720

CGCCAGTGGC GGCGGCGCTG GGTTCTCCCT TCGATGCTCC CCTGGACCCG CCGTTTGTGC 15780

CTCCGCGGTA CCTGCGGCCT ACCGGGGGGA GAAACAGCAT CCGTTACTCT GAGTTGGCAC 15840

CCCTATTCGA CACCACCCGT GTGTACCTGG TGGACAACAA GTCAACGGAT GTGGCATCCC 15900

TGAACTACCA GAACGACCAC AGCAACTTTC TGACCACGGT CATTCAAAAC AATGACTACA 15960

GCCCGGGGGA GGCAAGCACA CAGACCATCA ATCTTGACGA CCGGTCGCAC TGGGGCGGCG 16020

ACCTGAAAAC CATGCTGCAT ACCAACATGC CAAATGTGAA CGAGTTCATG TTTACCAATA 16080

AGTTTAAGGC GCGGGTGATG GTGTCGCGCT TGCCTACTAA GGACAATCAG GTGGAGCTGA 16140

AATACGAGTG GGTGGAGTTC ACGCTGCCCG AGGGCAACTA CTCCGAGACC ATGACCATAG 16200

ACCTTATGAA CAACGCGATC GTGGAGCACT ACTTGAAAGT GGGCAGACAG AACGGGGTTC 16260

TGGAAAGCGA CATCGGGGTA AAGTTTGACA CCCGCAACTT CAGACTGGGG TTTGACCCCG 16320

TCACTGGTCT TGTCATGCCT GGGGTATATA CAAACGAAGC CTTCCATCCA GACATCATTT 16380

TGCTGCCAGG ATGCGGGGTG GACTTCACCC ACAGCCGCCT GAGCAACTTG TTGGGCATCC 16440

GCAAGCGGCA ACCCTTCCAG GAGGGCTTTA GGATCACCTA CGATGATCTG GAGGGTGGTA 16500

ACATTCCCGC ACTGTTGGAT GTGGACGCCT ACCAGGCGAG CTTGAAAGAT GACACCGAAC 16560

AGGGCGGGGG TGGCGCAGGC GGCAGCAACA GCAGTGGCAG CGGCGCGGAA GAGAACTCCA 16620

ACGCGGCAGC CGCGGCAATG CAGCCGGTGG AGGACATGAA CGATCATGCC ATTCGCGGCG 16680

ACACCTTTGC CACACGGGCT GAGGAGAAGC GCGCTGAGGC CGAAGCAGCG GCCGAAGCTG 16740

CCGCCCCCGC TGCGCAACCC GAGGTCGAGA AGCCTCAGAA GAAACCGGTG ATCAAACCCC 16800

TGACAGAGGA CAGCAAGAAA CGCAGTTACA ACCTAATAAG CAATGACAGC ACCTTCACCC 16860

AGTACCGCAG CTGGTACCTT GCATACAACT ACGGCGACCC TCAGACCGGA ATCCGCTCAT 16920

GGACCCTGCT TTGCACTCCT GACGTAACCT GCGGCTCGGA GCAGGTCTAC TGGTCGTTGC 16980

CAGACATGAT GCAAGACCCC GTGACCTTCC GCTCCACGCG CCAGATCAGC AACTTTCCGG 17040

TGGTGGGCGC CGAGCTGTTG CCCGTGCACT CCAAGAGCTT CTACAACGAC CAGGCCGTCT 17100

ACTCCCAACT CATCCGCCAG TTTACCTCTC TGACCCACGT GTTCAATCGC TTTCCCGAGA 17160

ACCAGATTTT GGCGCGCCCG CCAGCCCCCA CCATCACCAC CGTCAGTGAA AACGTTCCTG 17220

CTCTCACAGA TCACGGGACG CTACCGCTGC GCAACAGCAT CGGAGGAGTC CAGCGAGTGA 17280

CCATTACTGA CGCCAGACGC CGCACCTGCC CCTACGTTTA CAAGGCCCTG GGCATAGTCT 17340

CGCCGCGCGT CCTATCGAGC CGCACTTTTT GAGCAAGCAT GTCCATCCTT ATATCGCCCA 17400

GCAATAACAC AGGCTGGGGC CTGCGCTTCC CAAGCAAGAT GTTTGGCGGG GCCAAGAAGC 17460

GCTCCGACCA ACACCCAGTG CGCGTGCGCG GGCACTACCG CGCGCCCTGG GGCGCGCACA 17520

AACGCGGCCG CACTGGGCGC ACCACCGTCG ATGACGCCAT CGACGCGGTG GTGGAGGAGG 17580

CGCGCAACTA CACGCCCACG CCGCCACCAG TGTCCACAGT GGACGCGGCC ATTCAGACCG 17640

TGGTGCGCGG AGCCCGGCGC TATGCTAAAA TGAAGAGACG GCGGAGGCGC GTAGCACGTC 17700

GCCACCGCCG CCGACCCGGC ACTGCCGCCC AACGCGCGGC GGCGGCCCTG CTTAACCGCG 17760

CACGTCGCAC CGGCCGACGG GCGGCCATGC GGGCCGCTCG AAGGCTGGCC GCGGGTATTG 17820

TCACTGTGCC CCCCAGGTCC AGGCGACGAG CGGCCGCCGC AGCAGCCGCG GCCATTAGTG 17880

CTATGACTCA GGGTCGCAGG GGCAACGTGT ATTGGGTGCG CGACTCGGTT AGCGGCCTGC 17940

GCGTGCCCGT GCGCACCCGC CCCCCGCGCA ACTAGATTGC AAGAAAAAAC TACTTAGACT 18000

CGTACTGTTG TATGTATCCA GCGGCGGCGG CGCGCAACGA AGCTATGTCC AAGCGCAAAA 18060

TCAAAGAAGA GATGCTCCAG GTCATCGCGC CGGAGATCTA TGGCCCCCCG AAGAAGGAAG 18120

AGCAGGATTA CAAGCCCCGA AAGCTAAAGC GGGTCAAAAA GAAAAAGAAA GATGATGATG 18180

ATGAACTTGA CGACGAGGTG GAACTGCTGC ACGCTACCGC GCCCAGGCGA CGGGTACAGT 18240

GGAAAGGTCG ACGCGTAAAA CGTGTTTTGC GACCCGGCAC CACCGTAGTC TTTACGCCCG 18300

GTGAGCGCTC CACCCGCACC TACAAGCGCG TGTATGATGA GGTGTACGGC GACGAGGACC 18360

TGCTTGAGCA GGCCAACGAG CGCCTCGGGG AGTTTGCCTA CGGAAAGCGG CATAAGGACA 18420

TGCTGGCGTT GCCGCTGGAC GAGGGCAACC CAACACCTAG CCTAAAGCCC GTAACACTGC 18480

AGCAGGTGCT GCCCGCGCTT GCACCGTCCG AAGAAAAGCG CGGCCTAAAG CGCGAGTCTG 18540

GTGACTTGGC ACCCACCGTG CAGCTGATGG TACCCAAGCG CCAGCGACTG GAAGATGTCT 18600

TGGAAAAAAT GACCGTGGAA CCTGGGCTGG AGCCCGAGGT CCGCGTGCGG CCAATCAAGC 18660

AGGTGGCGCC GGGACTGGGC GTGCAGACCG TGGACGTTCA GATACCCACT ACCAGTAGCA 18720

CCAGTATTGC CACCGCCACA GAGGGCATGG AGACACAAAC GTCCCCGGTT GCCTCAGCGG 18780

TGGCGGATGC CGCGGTGCAG GCGGTCGCTG CGGCCGCGTC CAAGACCTCT ACGGAGGTGC 18840

AAACGGACCC GTGGATGTTT CGCGTTTCAG CCCCCCGGCG CCCGCGCGGT TCGAGGAAGT 18900

ACGGCGCCGC CAGCGCGCTA CTGCCCGAAT ATGCCCTACA TCCTTCCATT GCGCCTACCC 18960

CCGGCTATCG TGGCTACACC TACCGCCCCA GAAGACGAGC AACTACCCGA CGCCGAACCA 19020

CCACTGGAAC CCGCCGCCGC CGTCGCCGTC GCCAGCCCGT GCTGGCCCCG ATTTCCGTGC 19080

GCAGGGTGGC TCGCGAAGGA GGCAGGACCC TGGTGCTGCC AACAGCGCGC TACCACCCCA 19140

GCATCGTTTA AAAGCCGGTC TTTGTGGTTC TTGCAGATAT GGCCCTCACC TGCCGCCTCC 19200

GTTTCCCGGT GCCGGGATTC CGAGGAAGAA TGCACCGTAG GAGGGGCATG GCCGGCCACG 19260

GCCTGACGGG CGGCATGCGT CGTGCGCACC ACCGGCGGCG GCGCGCGTCG CACCGTCGCA 19320

TGCGCGGCGG TATCCTGCCC CTCCTTATTC CACTGATCGC CGCGGCGATT GGCGCCGTGC 19380

CCGGAATTGC ATCCGTGGCC TTGCAGGCGC AGAGACACTG ATTAAAAACA AGTTGCATGT 19440

GGAAAAATCA AAATAAAAAG TCTGGACTCT CACGCTCGCT TGGTCCTGTA ACTATTTTGT 19500

AGAATGGAAG ACATCAACTT TGCGTCTCTG GCCCCGCGAC ACGGCTCGCG CCCGTTCATG 19560

GGAAACTGGC AAGATATCGG CACCAGCAAT ATGAGCGGTG GCGCCTTCAG CTGGGGCTCG 19620

CTGTGGAGCG GCATTAAAAA TTTCGGTTCC ACCGTTAAGA ACTATGGCAG CAAGGCCTGG 19680

AACAGCAGCA CAGGCCAGAT GCTGAGGGAT AAGTTGAAAG AGCAAAATTT CCAACAAAAG 19740

GTGGTAGATG GCCTGGCCTC TGGCATTAGC GGGGTGGTGG ACCTGGCCAA CCAGGCAGTG 19800

CAAAATAAGA TTAACAGTAA GCTTGATCCC CGCCCTCCCG TAGAGGAGCC TCCACCGGCC 19860

GTGGAGACAG TGTCTCCAGA GGGGCGTGGC GAAAAGCGTC CGCGCCCCGA CAGGGAAGAA 19920

ACTCTGGTGA CGCAAATAGA CGAGCCTCCC TCGTACGAGG AGGCACTAAA GCAAGGCCTG 19980
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CCCACCACCC 

ACGCTGGACC 
GCCGTTGTTG 
TCGTTGCGGC 
GGGGTGCAAT 
CATGTATGCG 
AGATGGCTAC 
CCTCGGAGTA 
GCCTGAATAA 
GGTCCCAGCG 
ACAAGGCGCG 
ACTTTGACAT 
CCTACAACGC 
CTGCTCTTGA 
AAGCTGAGCA 
CAAAGGAGGG 
TTCAACCTGA 
CTGGGAGAGT 
CCACAAATGA 
GTCAAGTGGA 
TGACTCCTAA 
TTTCTTACAT 
TGCCCAACAG 
ACAGCACGGG 
ATTTGCAAGA 
GAACCAGGTA 
TTATTGAAAA 
TGATTAATAC 
AAAAAGATGC 
TGGAAATCAA 
ATTTGCCCGA 
CCTACGACTA 
TTGGAGCACG 
ATGCTGGCCT 
TCCAGGTGCC 



GTCCCATCGC 
TGCCTCCCCC 
TAACCCGTCC 
CCGTAGCCAG 
CCCTGAAGCG 
TCCATGTCGC 
CCCTTCGATG 
CCTGAGCCCC 
CAAGTTTAGA 
TTTGACGCTG 
GTTCACCCTA 
CCGCGGCGTG 
CCTGGCTCCC 
AATAAACCTA 
GCAAAAAACT 
TATTCAAATA 
ACCTCAAATA 
CCTTAAAAAG 
AAATGGAGGG 
AATGCAATTT 
AGTGGTATTG 
GCCCACTATT 
GCCTAATTAC 
TAATATGGGT 
CAGAAACACA 
CTTTTCTATG 
TCATGGAACT 
AGAGACTCTT 
TACAGAATTT 
TCTAAATGCC 
CAAGCTAAAG 
CATGAACAAG 
CTGGTCCCTT 
GCGCTACCGC 
TCAGAAGTTC 



GCCCATGGCT 
CGCCGACACC 
TAGCCGCGCG 
TGGCAACTGG 
CCGACGATGC 
CGCCAGAGGA 
ATGCCGCAGT 
GGGCTGGTGC 
AACCCCACGG 
CGGTTCATCC 
GCTGTGGGTG 
CTGGACAGGG 
AAGGGTGCCC 
GAAGAAGAGG 
CACGTATTTG 
GGTGTCGAAG 
GGAGAATCTC 
ACTACCCCAA 
CAAGGCATTC 
TTCTCAACTA 
TACAGTGAAG 
AAGGAAGGTA 
ATTGCTTTTA 
GTTCTGGCGG 
GAGCTTTCAT 
TGGAATCAGG 
GAAGATGAAC 
ACCAAGGTAA 
TCAGATAAAA 
AACCTGTGGA 
TACAGTCCTT 
CGAGTGGTGG 
GACTATATGG 
TCAATGTTGC 
TTTGCCATTA 



ACCGGAGTGC 
CAGCAGAAAC 
TCCCTGCGCC 
CAAAGCACAC 
TTCTGAATAG 
GCTGCTGAGC 
GGTCTTACAT 
AGTTTGCCCG 
TGGCGCCTAC 
CTGTGGACCG 
ATAACCGTGT 
GCCCTACTTT 
CAAATCCTTG 
ACGATGACAA 
GGCAGGCGCC 
GTCAAACACC 
AGTGGTACGA 
TGAAACCATG 
TTGTAAAGCA 
CTGAGGCGAC 
ATGTAGATAT 
ACTCACGAGA 
GGGACAATTT 
GCCAAGCATC 
ACCAGCTTTT 
CTGTTGACAG 
TTCCAAATTA 
AACCTAAAAC 
ATGAAATAAG 
GAAATTTCCT 
CCAACGTAAA 
CTCCCGGGTT 
ACAACGTCAA 
TGGGCAATGG 
AAAACCTCCT 



TGGGCCAGCA 
CTGTGCTGCC 
GCGCCGCCAG 
TGAACAGCAT 
CTAACGTGTC 
CGCCGCGCGC 
GCACATCTCG 
CGCCACCGAG 
GCACGACGTG 
TGAGGATACT 
GCTGGACATG 
TAAGCCCTAC 
CGAATGGGAT 
CGAAGACGAA 
TTATTCTGGT 
TAAATATGCC 
AACTGAAATT 
TTACGGTTCA 
ACAAAATGGA 
CGCAGGCAAT 
AGAAACCCCA 
ACTAATGGGC 
TATTGGTCTA 
GCAGTTGAAT 
GCTTGATTCC 
CTATGATCCA 
CTGCTTTCCA 
AGGTCAGGAA 
AGTTGGAAAT 
GTACTCCAAC 
AATTTCTGAT 
AGTGGACTGC 
CCCATTTAAC 
TCGCTATGTG 
TCTCCTGCCG 



PCT/US97/19541 

CACACCCGTA 2 0040 

AGGCCCGACC 2 010 0 

CGGTCCGCGA 2 016 0 

CGTGGGTCTG 2 0220 

GTATGTGTGT 2 0280 

CCGCTTTCCA 2 0340 

GGCCAGGACG 2 0400 

ACGTACTTCA 2 0460 

ACCACAGACC 2 0520 

GCGTACTCGT 2 0580 

GCTTCCACGT 2 0640 

TCTGGCACTG 2 0700 

GAAGCTGCTA 20760 

GTAGACGAGC 2 082 0 

ATAAATATTA 20880 

G ATAAAACAT 2 094 0 

AATCATGCAG 2100 0 

TATGCAAAAC 21060 

AAGCTAGAAA 2112 0 

GGTGATAACT 2118 0 

GACACTCATA 2124 0 

CAACAATCTA 213 00 

ATGTATTACA 2136 0 

GCTGTTGTAG 2142 0 

ATTGGTGATA 214 8 0 

GATGTTAGAA 21540 

CTGGGAGGTG 21500 

AATGGATGGG 2166 0 

AATTTTGCCA 2172 0 

ATAGCGCTGT 217 8 0 

AACCCAAACA 21840 

TACATTAACC 21900 

CACCACCGCA 21960 

CCCTTCCACA 22020 

GGCTCATACA 22 080 
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CCTACGAGTG 

ACCTAAGGGT 
TCCCCATGGC 
ACCAGTCCTT 
CTACCAACGT 
TCACGCGCCT 
CCTACTCTGG 
AGGTGGCCAT 
CCAACGAGTT 
ACATGACCAA 
GCTTCTATAT 
CCATGAGCCG 
TACACCAACA 
AGGCCTACCC 
CCCAGAAAAA 
TGTCCATGGG 
CGCTAGACAT 
TTGAAGTCTT 
ACCTGCGCAC 
AACAGCTGCC 
TTGTGGGCCA 
CAAGCTCGCC 
GGCCTTTGCC 
TGACCAGCGA 
CATTGCTTCT 
GCCCAACTCG 
GCCCCAAACT 
CATGCTCAAC 
CTTCCTGGAG 
TTCTTTTTGT 
CAAATGCTTT 
CGTTTAAAAA 
GCGATACTGG 
GAAGTTTTCA 
TATCTTGAAG 



GAACTTCAGG 

TGACGGAGCC 

CCACAACACC 

TAACGACTAT 

GCCCATATCC 

TAAGACTAAG 

CTCTATACCC 

TACCTTTGAC 

TGAAATTAAG 

AGACTGGTTC 

CCCAGAGAGC 

TCAGGTGGTG 

CAACAACTCT 

TGCTAACTTC 

GTTTCTTTGC 

CGCACTCACA 

GACTTTTGAG 

TGACGTGGTC 

GCCCTTCTCG 

GCCATGGGCT 

TATTTTTTGG 

TGCGCCATAG 

TGGAACCCGC 

CTCAAGCAGG 

TCCCCCGACC 

GCCGCCTGTG 

CCCATGGATC 

AGTCCCCAGG 

CGCCACTCGC 

CACTTGAAAA 

TATTTGTACA 

TCAAAGGGGT 

TGTTTAGTGC 

CTCCACAGGC 

TCGCAGTTGG 



AAGGATGTTA 
AGCATTAAGT 
GCCTCCACGC 
CTCTCCGCCG 
ATCCCCTCCC 
GAAACCCCAT 
TACCTAGATG 
TCTTCTGTCA 
CGCTCAGTTG 
CTGGTACAAA 
TACAAGGACC 
GATGATACTA 
GGATTTGTTG 
CCCTATCCGC 
GATCGCACCC 
GACCTGGGCC 
GTGGATCCCA 
CGTGTGCACC 
GCCGGCAACG 
CCAGTGAGCA 
GCACCTATGA 
TCAATACGGC 
ACTCAAAAAC 
TTTACCAGTT 
GCTGTATAAC 
GACTATTCTG 
ACAACCCCAC 
TACAGCCCAC 
CCTACTTCCG 
ACATGTAAAA 
CTCTCGGGTG 
TCTGCCGCGC 
TCCACTTAAA 
TGCGCACCAT 
GGCCTCCGCC 



ACATGGTTCT 
TTGATAGCAT 
TTGAGGCCAT 
CCAACATGCT 
GCAACTGGGC 
CACTGGGCTC 
GAACCTTTTA 
GCTGGCCTGG 
ACGGGGAGGG 
TGCTAGCTAA 
GCATGTACTC 
AATACAAGGA 
GCTACCTTGC 
TTATAGGCAA 
TTTGGCGCAT 
AAAACCTTCT 
TGGACGAGCC 
GGCCGCACCG 
CCACAACATA 
GGAACTGAAA 
CAAGCGCTTT 
CGGTCGCGAG 
ATGCTACCTC 
TGAGTACGAG 
GCTGGAAAAG 
CTGCATGTTT 
CATGAACCTT 
CCTGCGTCGC 
CAGCCACAGT 
ATAATGTACT 
ATTATTTACC 
ATCGCTATGC 
CTCAGGCACA 
CACCAACGCG 
CTGCGCGCGC 



GCAGAGCTCC 
TTGCCTTTAC 
GCTTAGAAAC 
CTACCCTATA 
GGCTTTCCGC 
GGGCTACGAC 
CCTCAACCAC 
CAATGACCGC 
TTACAACGTT 
CTACAACATT 
CTTCTTTAGA 
CTACCAACAG 
CCCCACCATG 
GACCGCAGTT 
CCCATTCTCC 
CTACGCCAAC 
CACCCTTCTT 
CGGCGTCATC 
AAGAAGCAAG 
GCCATTGTCA 
CCAGGCTTTG 
ACTGGGGGCG 
TTTGAGCCCT 
TCACTCCTGC 
TCCACCCAAA 
CTCCACGCCT 
ATTACCGGGG 
AACCAGGAAC 
GCGCAGATTA 
AGAGACACTT 
CCCACCCTTG 
GCCACTGGCA 
ACCATCCGCG 
TTTAGCAGGT 
GAGTTGCGAT 



PCTAJS97/19541 

CTAGGAAATG 2214 0 

GCCACCTTCT 22200 

GACACCAACG 2226 0 

CCCGCCAACG 22 320 

GGCTGGGCCT 22 3 80 

CCTTATTACA 2244 0 

ACCTTTAAGA 22 5 00 

CTGCTTACCC 2 2 560 

GCCCAGTGTA 2262 0 

GGCTACCAGG 226 80 

AACTTCCAGC 22 740 

GTGGGCATCC 2 2800 

CGCGAAGGAC 22 86 0 

GACAGCATTA 22 92 0 

AGTAACTTTA 22 980 

TCCGCCCACG 2 3 040 

TATGTTTTGT 23100 

GAAACCGTGT 2316 0 

CAACATCAAC 23 220 

AAGATCTTGG 2 3 280 

TTTCTCCACA 23 34 0 

TACACTGGAT 23 400 

TTGGCTTTTC 2 3460 

GCCGTAGCGC 23 520 

GCGTACAGGG 23 58 0 

TTGCCAACTG 23 640 

TACCCAACTC 23 700 

AGCTCTACAG 23 760 

GGAGCGCCAC 23 82 0 

TCAATAAAGG 23 880 

CCGTCTGCGC 23 940 

GGGACACGTT 24 00 0 

GCAGCTCGGT 24 06 0 

CGGGCGCCGA 2412 0 

ACACAGGGTT 2418 0 
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GCAGCACTGG 
GATCAGATCC 
CTGCCTTCCC 
CAAAAGGTGA 
CTGCTTAAAA 
GGAAAACTGA 
GATCTGCACC 
CTTCAGCGCG 
TATCATAATG 
CCACAACGCG 
GTACGCCTGC 
CTGCAACCCG 
TTGGTCAGGC 
CAGCGCGCGC 
GTTCATCACC 
CATACCACGC 
GCCATGCTTG 
TCTTTCTTCC 
AGGGCGCTTC 
CGGGCTGGGT 
GATACGCCGC 
GGACGACACG 
GGTTTCGCGC 
CATGGAGTCA 
CTCCACCGAT 
GGAGGA.AGTG 
AGTACCAACA 
CGGGCGGGGG 
GCATCTGCAG 
CCTCGCCATA 
CCCCAAACGC 
ATTTGCCGTG 
CCTATCCTGC 
TGTCATACCT 
CGACGAGAAG 



AACACTATCA 
GCGTCCAGGT 
AAAAAGGGCG 
CCGTGCCCGG 
GCCACCTGAG 
TTGGCCGGAC 
ACATTTCGGC 
CGCTGCCCGT 
CTTCCGTGTA 
CAGCCCGTGG 
AGGAATCGCC 
CGGTGCTCCT 
AGTAGTTTGA 
GCAGCCTCCA 
GTAATTTCAC 
GCCACTGGGT 
ATTAGCACCG 
TCGCTGTCCA 
TTTTTCTTCT 
GTGCGCGGCA 
CTCATCCGCT 
TCCTCCATGG 
TGCTCCTCTT 
GTCGAGAAGA 
GCCGCCAACG 
ATTATCGAGC 
GAGGATAAAA 
GACGAAAGGC 
CGCCAGTGCG 
GCGGATGTCA 
CAAGAAAACG 
CCAGAGGTGC 
CGTGCCAACC 
GATATCGCCT 
CGCGCGGCAA 



GCGCCGGGTG 
CCTCCGCGTT 
CGTGCCCAGG 
TCTGGGCGTT 
CCTTTGCGCC 
AGGCCGCGTC 
CCCACCGGTT 
TTTCGCTCGT 
GACACTTAAG 
GCTCGTGATG 
CCATCATCGT 
CGTTCAGCCA 
AGTTCGCCTT 
TGCCCTTCTC 
TTTCCGCTTC 
CGTCTTCATT 
GTGGGTTGCT 
CGATTACCTC 
TGGGCGCAAT 
CCAGCGCGTC 
TTTTTGGGGG 
TTGGGGGACG 
CCCGACTGGC 
AGGACAGCCT 
CGCCTACCAC 
AGGACCCAGG 
AGCAAGACCA 
ATGGCGACTA 
CC ATTATCTG 
GCCTTGCCTA 
GCACATGCGA 
TTGCCACCTA 
GCAGCCGAGC 
CGCTCAACGA 
ACGCTCTGCA 



GTGCACGCTG 
GCTCAGGGCG 
CTTTGAGTTG 
AGGATACAGC 
TTCAGAGAAG 
GTGCACGCAG 
CTTCACGATC 
CACATCCATT 
CTCGCCTTCG 
CTTGTAGGTC 
CACAAAGGTC 
GGTCTTGCAT 
TAGATCGTTA 
CCACGCAGAC 
GCTGGGCTCT 
CAGCCGCCGC 
GAAACCCACC 
TGGTGATGGC 
GGCCAAATCC 
TTGTGATGAG 
CGCCCGGGGA 
TCGCGCCGCA 
CATTTCCTTC 
AACCGCCCCC 
CTTCCCCGTC 
TTTTGTAAGC 
GGACAACGCA 
CCTAGATGTG 
CGACGCGTTG 
CGAACGCCAC 
GCCCAACCCG 
TCACATCTTT 
GGACAAGCAG 
AGTGCCAAAA 
ACAGGAAAAC 



GCCAGCACGC 
AACGGAGTCA 
CACTCGCACC 
GCCTGCATAA 
AACATGCCGC 
CACCTTGCGT 
TTGGCCTTGC 
TCAATCACGT 
ATCTCAGCGC 
ACCTCTGCAA 
TTGTTGCTGG 
ACGGCCGCCA 
TCCACGTGGT 
ACGATCGGCA 
TCCTCTTCCT 
ACTGTGCGCT 
ATTTGTAGCG 
GGGCGCTCGG 
GCCGCCGAGG 
TCTTCCTCGT 
GGCGGCGGCG 
CCGCGTCCGC 
TCCTATAGGC 
TCTGAGTTCG 
GAGGCACCCC 
GAAGACGACG 
GAGGCAAACG 
GGAGACGACG 
CAAGAGCGCA 
CTATTCTCAC 
CGCCTCAACT 
TTCCAAAACT 
CTGGCCTTGC 
ATCTTTGAGG 
AGCGAAAATG 



PCT/US97/19541 

TCTTGTCGGA 2424 0 

ACTTTGGTAG 2 4 3 0 0 

GTAGTGGCAT 2 4 360 

AAGCCTTGAT 2 4420 

AAGACTTGCC 2 4480 

CGGTGTTGGA 24 540 

TAGACTGCTC 246 00 

GCTCCTTATT 2 4 660 

AGCGGTGCAG 24 720 

ACGACTGCAG 247 8 0 

TGAAGGTCAG 2 4840 

GAGCTTCCAC 24 900 

ACTTGTCCAT 24 960 

CACTCAGCGG 2 5 020 

CTTGCGTCCG 2 5080 

TACCTCCTTT 2 514 0 

CCACATCTTC 2 5200 

GCTTGGGAGA 2 52 60 

TCGATGGCCG 2 5320 

CCTCGGACTC 2 5380 

ACGGGGACGG 2 5440 

GCTCGGGGGT 2 5 500 

AGAAAAAGAT 2 5560 

CCACCACCGC 2 5620 

CGCTTGAGGA 2 5680 

AGGACCGCTC 2 5740 

AGGAACAAGT 2 5800 

TGCTGTTGAA 2 5860 

GCGATGTGCC 2 5920 

CGCGCGTACC 2 5980 

TCTACCCCGT 26 040 

GCAAGATACC 2 6100 

GGCAGGGCGC 2 6160 

GTCTTGGACG 26220 

AAAGTCACTC 2 6280 
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TGGAGTGTTG 
CGAGGTCACC 
CATGAGTGAG 
ACAAACAGAG 
GCGCGAGCCT 
CGTGGAGCTT 
GGAAACATTG 
CGTGGAGCTC 
AAACGTGCTT 
TTAGTTATTT 
GGAGTGCAAC 
GGCCTTCAAC 
GCTTAAAACC 
TAGGAACTTT 
CGACTTTGTG 
TCTGCAGCTA 
CGGTCTACTG 
CAATTCGCAG 
GCCTGACGAA 
TTACCTTCGC 
CCAATCCCGC 
TGGCCAATTG 
GGTTTACTTG 
CTATCAGCAG 
TGCCGCCGCC 
ACGAGGAGGA 
TCGAAGAGGT 
AGAAATCGGC 
TGCCCGTTCG 
AGCAGCCGCC 
GGCACAAGAA 
GCCGCTTTCT 
GTCATCTCTA 
ACACAGAAGC 
GCGGCAGCAG 



GTGGAACTCG 
CACTTTGCCT 
CTGATCGTGC 
GAGGGCCTAC 
GCCGACTTGG 
GAGTGCATGC 
CACTACACCT 
TGCAACCTGG 
CATTCCACGC 
CTATGCTACA 
CTCAAGGAGC 
GAGCGCTCCG 
CTGCAACAGG 
ATCCTAGAGC 
CCCATTAAGT 
GCCAACTACC 
GAGTGTCACT 
CTGCTTAACG 
AAGTCCGCGG 
AAATTTGTAC 
CCGCCAAATG 
CAAGCCATCA 
GACCCCCAGT 
CAGCCGCGGG 
ACCCACGGAC 
GGAGGACATG 
GTCAGACGAA 
AACCGGTTCC 
CCGACCCAAC 
GCCGTTAGCC 
CGCCATAGTT 
TCTCTACCAT 
CAGCCCATAC 
AAAGGCGACC 
CAGGAGGAGG 



AGGGTGACAA 
ACCCGGCACT 
GCCGTGCGCA 
CCGCAGTTGG 
AGGAGCGACG 
AGCGGTTCTT 
TTCGACAGGG 
TCTCCTACCT 
TCAAGGGCGA 
GCTGGCAGAC 
TGCAGAAACT 
TGGCCGCGCA 
GTCTGCCAGA 
GCTCAGGAAT 
ACCGCGAATG 
TTGCCTACCA 
GTCGCTGCAA 
AAAGTCAAAT 
CTCCGGGGTT 
CTGAGGACTA 
CGGAGCTTAC 
ACAAAGCCCG 
CCGGCGAGGA 
CCCTTGCTTC 
GAGGAGGAAT 
ATGG^iAGACT 
ACACCGTCAC 
AGCATGGCTA 
CGTAGATGGG 
CAAGAGCAAC 
GCTTGCTTGC 
CACGGCGTGG 
TGCACCGGCG 
GGATAGCAAG 
AGCGCTGCGT 



CGCGCGCCTA 
TAACCTACCC 
GCCCCTGGAG 
CGACGAGCAG 
CAAACTAATG 
TGCTGACCCG 
CTACGTACGC 
TGGAATTTTG 
GGCGCGCCGC 
GGCCATGGGC 
GCTAAAGCAA 
CCTGGCGGAC 
CTTCACCAGT 
CTTGCCCGCC 
CCCTCCGCCG 
CTCTGACATA 
CCTATGCACC 
TATCGGTACC 
GAAACTCACT 
CCACGCCCAC 
CGCCTGCGTC 
CCAAGAGTTT 
GCTCAACCCA 
CCAGGATGGC 
ACTGGGACAG 
GGGAGAGCCT 
CCTCGGTCGC 
CAACCTCCGC 
ACACCACTGG 
AACAGCGCCA 
AAGACTGTGG 
CCTTCCCCCG 
GCAGCGGCAG 
ACTCTGACAA 
CTGGCGCCCA 



GCCGTACTAA 
CCCAAGGTCA 
AGGGATGCAA 
CTAGCGCGCT 
ATGGCCGCAG 
GAGATGCAGC 
CAGGCCTGCA 
CACGAAAACC 
GACTACGTCC 
GTTTGGCAGC 
AACTTGAAGG 
ATCATTTTCC 
CAAAGCATGT 
ACCTGCTGTG 
CTTTGGGGCC 
ATGGAAGACG 
CCGCACCGCT 
TTTGAGCTGC 
CCGGGGCTGT 
GAGATTAGGT 
ATTACCCAGG 
CTGCTACGAA 
ATCCCCCCGC 
ACCCAAAAAG 
TCAGGCAGAG 
AGACGAGGAA 
ATTCCCCTCG 
TCCTCAGGCG 
AACCAGGGCC 
AGGCTACCGC 
GGGCAACATC 
TAACATCCTG 
CGGCAGCAAC 
AGCCCAAGAA 
ACGAACCCGT 



PCT/US97/19541 

AACGCAGCAT 26 34 0 

TGAGCACAGT 26 4 00 

ATTTGCAAGA 26460 

GGCTTCAAAC 26520 

TGCTCGTTAC 26 58 0 

GCAAGCTAGA 2664 0 

AGATCTCCAA 26 7 00 

GCCTTGGGCA 267 60 

GCGACTGCGT 26820 

AGTGCTTGGA 26880 

ACCTATGGAC 26 94 0 

CCGAACGCCT 27000 

TGCAGAACTT 2 7060 

CACTTCCTAG 27120 

ACTGCTACCT 27180 

TGAGCGGTGA 2 724 0 

CCCTGGTTTG 273 00 

AGGGTCCCTC 2 7360 

GGACGTCGGC 27420 

TCTACGAAGA 27480 

GCCACATTCT 2 7540 

AGGGACGGGG 2 7600 

CGCCGCAGCC 2 7660 

AAGCTGC AGO 2 7720 

GAGGTTTTGG 2 77 80 

GCTTCCGAGG 2 7840 

CCGGCGCCCC 27900 

CCGCCGGCAC 2 7960 

GGTAAGTCCA 28020 

TCATGGCGCG 2 8080 

TCCTTCGCCC 28140 

CATTACTACC 2 8200 

AGCAGCGGCC 28260 

ATCCACAGCG 28320 

ATCGACCCGC 28380 
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GAGCTTAGAA 


ACAGGATTTT 


TCCCACTCTG 


TATGCTATAT 


TTCAACAGAG 


CAGGGGCCAA 


28440 


GAACAAGAGC 


TGAAAATAAA 


AAACAGGTCT 


CTGCGATCCC 


TCACCCGCAG 


CTGCCTGTAT 


28500 


CACAAAAGCG 


AAGATCAGCT 


TCGGCGCACG 


CTGGAAGACG 


CGGAGGCTCT 


CTTCAGTAAA 


28560 


TACTGCGCGC 


TGACTCTTAA 


GGACTAGTTT 


CGCGCCCTTT 


CTCAAATTTA 


AGCGCGAAAA 


28620 


CTACGTCATC 


TCCAGCGGCC 


ACACCCGGCG 


CCAGCACCTG 


TCGTCAGCGC 


CATTATGAGC 


28680 


AAGGAAATTC 


CCACGCCCTA 


CATGTGGAGT 


TACCAGCCAC 


AAATGGGACT 


TGCGGCTGGA 


28740 


GCTGCCCAAG 


ACTACTCAAC 


CCGAATAAAC 


TACATGAGCG 


CGGGACCCCA 


CATGATATCC 


28800 


CGGGTCAACG 


GAATCCGCGC 


CCACCGAAAC 


CGAATTCTCT 


TGGAACAGGC 


GGCTATTACC 


28860 


ACCACACCTC 


GTAATAACCT 


TAATCCCCGT 


AGTTGGCCCG 


CTGCCCTGGT 


GTACCAGGAA 


28920 


AGTCCCGCTC 


CCACCACTGT 


GGTACTTCCC 


AGAGACGCCC 


AGGCCGAAGT 


TCAGATGACT 


28980 


AACTCAGGGG 


CGCAGCTTGC 


GGGCGGCTTT 


CGTCACAGGG 


TGCGGTCGCC 


CGGGCAGGGT 


29040 


ATAACTCACC 


TGACAATCAG 


AGGGCGAGGT 


ATTCAGCTCA 


ACGACGAGTC 


GGTGAGCTCC 


29100 


TCGCTTGGTC 


TCCGTCCGGA 


CGGGACATTT 


CAGATCGGCG 


GCGCCGGCCG 


TCCTTCATTC 


29160 


ACGCCTCGTC 


AGGCAATCCT 


AACTCTGCAG 


ACCTCGTCCT 


CTGAGCCGCG 


CTCTGGAGGC 


29220 


ATTGGAACTC 


TGCAATTTAT 


TGAGGAGTTT 


GTGCCATCGG 


TCTACTTTAA 


CCCCTTCTCG 


2 9280 


GGACCTCCCG 


GCCACTATCC 


GGATCAATTT 


ATTCCTAACT 


TTGACGCGGT 


AAAGGACTCG 


29340 


GCGGACGGCT 


ACGACTGAAT 


GTTAATTAAG 


TTCCTGTCCA 


TCCGCACCCA 


CTATCTTCAT 


29400 


GTTGTTGCAG 


ATGAAGCGCG 


CAAGACCGTC 


TGAAGATACC 


TTCAACCCCG 


TGTATCCATA 


29460 


TGACACGGAA 


ACCGGTCCTC 


CAACTGTGCC 


TTTTCTTACT 


CCTCCCTTTG 


TATCCCCC;^ 


29520 


TGGGTTTCAA 


GAGAGTCCCC 


CTGGGGTACT 


CTCTTTGCGC 


CTATCCGAAC 


CTCTAGTTAC 


29580 


CTCCAATGGC 


ATGCTTGCGC 


TCAAAATGGG 


CAACGGCCTC 


TCTCTGGACG 


AGGCCGGCAA 


29640 


CCTTACCTCC 


CAAAATGTAA 


CCACTGTGAG 


CCCACCTCTC 


AAAAAAACCA 


AGTCAAACAT 


29700 


AAACCTGGAA 


ATATCTGCAC 


CCCTCACAGT 


TACCTCAGAA 


GCCCTAACTG 


TGGCTGCCGC 


29760 


CGCACCTCTA 


ATGGTCGCGG 


GCAACACACT 


CACCATGCAA 


TCACAGGCCC 


CGCTAACCGT 


29820 


GCACGACTCC 


AAACTTAGCA 


TTGCCACCCA 


AGGACCCCTC 


ACAGTGTCAG 


AAGGAAAGCT 


29880 


AGCCCTGCAA 


ACATCAGGCC 


CCCTCACCAC 


CACCGATAGC 


AGTACCCTTA 


CTATCACTGC 


29940 


CTCACCCCCT 


CTAACTACTG 


CCACTGGTAG 


CTTGGGCATT 


GACTTGAAAG 


AGCCCATTTA 


30000 


TACACAAAAT 


GG AAAACTAG 


GACTAAAGTA 


CGGGGCTCCT 


TTGCATGTAA 


CAGACGACCT 


30060 


AAACACTTTG 


ACCGTAGCAA 


CTGGTCCAGG 


TGTGACTATT 


AATAATACTT 


CCTTGCAAAC 


30120 


TAAAGTTACT 


GGAGCCTTGG 


GTTTTGATTC 


ACAAGGCAAT 


ATGCAACTTA 


ATGTAGCAGG 


30180 


AGGACTAAGG 


ATTGATTCTC 


AAAACAGACG 


CCTTATACTT 


GATGTTAGTT 


ATCCGTTTGA 


30240 


TGCTCAAAAC 


CAACTAAATC 


TAAGACTAGG 


ACAGGGCCCT 


CTTTTTATAA 


ACTCAGCCCA 


30300 


CAACTTGGAT 


ATTAACTACA 


ACAAAGGCCT 


TTACTTGTTT 


ACAGCTTCAA 


ACAATTCCAA 


30360 


AAAGCTTGAG 


GTTAACCTAA 


GCACTGCCAA 


GGGGTTGATG 


TTTGACGCTA 


CAGCCATAGC 


30420 


CATTAATGCA 


GGAGATGGGC 


TTGAATTTGG 


TTCACCTAAT 


GCACCAAACA 


CAAATCCCCT 


30480 
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CAAAACAAAA 


ATTGGCCATG 


GCCTAGAATT 


TGATTCAAAC 


AAGGCTATGG 


TTCCTAAACT 


30540 


AGGAACTGGC 


CTTAGTTTTG 


ACAGCACAGG 


TGCCATTACA 


GTAGGAAACA 


AAAATAATGA 


30600 


TAAGCTAACT 


TTGTGGACCA 


CACCAGCTCC 


ATCTCCTAAC 


TGTAGACTAA 


ATGCAGAGAA 


30660 


AGATGCTAAA 


CTCACTTTGG 


TCTTAACAAA 


ATGTGGCAGT 


CAAATACTTG 


CTACAGTTTC 


30720 


AGTTTTGGCT 


GTTAAAGGCA 


GTTTGGCTCC 


AATATCTGGA 


ACAGTTCAAA 


GTGCTCATCT 


30780 


TATTATAAGA 


TTTGACGAAA 


ATGGAGTGCT 


ACTAAACAAT 


TCCTTCCTGG 


ACCCAGAATA 


30840 


TTGGAACTTT 


AGAAATGGAG 


ATCTTACTGA 


AGGCACAGCC 


TATACAAACG 


CTGTTGGATT 


30900 


TATGCCTAAC 


CTATCAGCTT 


ATCCAAAATC 


TCACGGTAAA 


ACTGCCAAAA 


GTAACATTGT 


30960 


CAGTCAAGTT 


TACTTAAACG 


GAGACAAAAC 


TAAACCTGTA 


ACACTAACCA 


TTACACTAAA 


31020 


CGGTACACAG 


GAAACAGGAG 


ACACAACTCC 


AAGTGCATAC 


TCTATGTCAT 


TTTCATGGGA 


31080 


CTGGTCTGGC 


CACAACTACA 


TTAATGAAAT 


ATTTGCCACA 


TCCTCTTACA 


CTTTTTCATA 


31140 


CATTGCCCAA 


GAATAAAGAA 


TCGTTTGTGT 


TATGTTTCAA 


CGTGTTTATT 


TTTCAATTGC 


31200 


AGAAAATTTC 


AAGTCATTTT 


TCATTCAGTA 


GTATAGCCCC 


ACCACCACAT 


AGCTTATACA 


31260 


GATCACCGTA 


CCTTAATCAA 


ACTCACAGAA 


CCCTAGTATT 


CAACCTGCCA 


CCTCCCTCCC 


31320 


AACACACAGA 


GTACACAGTC 


CTTTCTCCCC 


GGCTGGCCTT 


AAAAAGCATC 


ATATCATGGG 


31380 


TAACAGACAT 


ATTCTTAGGT 


GTTATATTCC 


ACACGGTTTC 


CTGTCGAGCC 


AAACGCTCAT 


31440 


CAGTGATATT 


AATAAACTCC 


CCGGGCAGCT 


CACTTAAGTT 


CATGTCGCTG 


TCCAGCTGCT 


31500 


GAGCCACAGG 


CTGCTGTCCA 


ACTTGCGGTT 


GCTTAACGGG 


CGGCGAAGGA 


GAAGTCCACG 


31560 


CCTACATGGG 


GGTAGAGTCA 


TAATCGTGCA 


TCAGGATAGG 


GCGGTGGTGC 


TGCAGCAGCG 


31620 


CGCGAATAAA 


CTGCTGCCGC 


CGCCGCTCCG 


TCCTGCAGGA 


ATACAACATG 


GCAGTGGTCT 


31680 


CCTCAGCGAT 


GATTCGCACC 


GCCCGCAGCA 


TAAGGCGCCT 


TGTCCTCCGG 


GCACAGCAGC 


31740 


GCACCCTGAT 


CTCACTTAAA 


TCAGCACAGT 


AACTGCAGCA 


CAGCACCACA 


ATATTGTTCA 


31800 


AAATCCCACA 


GTGCAAGGCG 


CTGTATCCAA 


AGCTCATGGC 


GGGGACCACA 


GAACCCACGT 


31860 


GGCCATCATA 


CCACAAGCGC 


AGGTAGATTA 


AGTGGCGACC 


CCTCATAAAC 


ACGCTGGACA 


31920 


TAAACATTAC 


CTCTTTTGGC 


ATGTTGTAAT 


TCACCACCTC 


CCGGTACCAT 


ATAAACCTCT 


31980 


GATTAAACAT 


GGCGCCATCC 


ACCACCATCC 


TAAACCAGCT 


GGCCAAAACC 


TGCCCGCCGG 


32040 


CTATACACTG 


CAGGGAACCG 


GGACTGGAAC 


AATGACAGTG 


GAGAGCCCAG 


GACTCGTAAC 


32100 


CATGGATCAT 


CATGCTCGTC 


ATGATATCAA 


TGTTGGCACA 


ACACAGGCAC 


ACGTGCATAC 


32160 


ACTTCCTCAG 


GATTACAAGC 


TCCTCCCGCG 


TTAGAACCAT 


ATCCCAGGGA 


ACAACCCATT 


32220 


CCTGAATCAG 


CGTAAATCCC 


ACACTGCAGG 


GAAGACCTCG 


CACGTAACTC 


ACGTTGTGCA 


32280 


TTGTCAAAGT 


GTTACATTCG 


GGCAGCAGCG 


GATGATCCTC 


CAGTATGGTA 


GCGCGGGTTT 


32340 


CTGTCTCAAA 


AGGAGGTAGA 


CGATCCCTAC 


TGTACGGAGT 


GCGCCGAGAC 


AACCGAGATC 


32400 


GTGTTGGTCG 


TAGTGTCATG 


CCAAATGGAA 


CGCCGGACGT 


AGTCATATTT 


CCTGAAGCAA 


32460 


AACCAGGTGC 


GGGCGTGACA 


AACAGATCTG 


CGTCTCCGGT 


CTCGCCGCTT 


AGATCGCTCT 


32520 


GTGTAGTAGT 


TGTAGTATAl 


CCACTCTCTC 


AAAGCATCCA 


GGCGCCCCCT 


GGCTTCGGGT 


32580 
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TCTATGTAAA 


CTCCTTCATG 


CGCCGCTGCC 


CTGATAACAT 


CCACCACCGC 


AGAATAAGCC 


32640 


ACACCCAGCC 


AACCTACACA 


TTCGTTCTGC 


GAGTCACACA 


CGGGAGGAGC 


GGG AAG AG CT 


32700 


GGAAGAACCA 


TGTTTTTTTT 


TTTATTCCAA 


AAGATTATCC 


AAAACCTCAA 


AATGAAGATC 


32760 


TATTAAGTGA 


ACGCGCTCCC 


CTCCGGTGGC 


GTGGTCAAAC 


TCTACAGCCA 


AAGAACAGAT 


32820 


AATGGCATTT 


GTAAGATGTT 


GCACAATGGC 


TTCCAAAAGG 


CAAACGGCCC 


TCACGTCCAA 


32880 


GTGGACGTAA 


AGGCTAAACC 


CTTCAGGGTG 


AATCTCCTCT 


ATAAACATTC 


CAGCACCTTC 


32940 


AACCATGCCC 


AAATAATTCT 


CATCTCGCCA 


CCTTCTCAAT 


ATATCTCTAA 


GCAAATCCCG 


33000 


AATATTAAGT 


CCGGCCATTG 


TAAAAATCTG 


CTCCAGAGCG 


CCCTCCACCT 


TCAGCCTCAA 


33060 


GCAGCGAATC 


ATGATTGCAA 


AAATTCAGGT 


TCCTCACAGA 


CCTGTATAAG 


ATTCAAAAGC 


33120 


GGAACATTAA 


CAAAAATACC 


GCGATCCCGT 


AGGTCCCTTC 


GCAGGGCCAG 


CTGAACATAA 


33180 


TCGTGCAGGT 


CTGCACGGAC 


CAGCGCGGCC 


ACTTCCCCGC 


CAGGAACCTT 


GACAAAAGAA 


33240 


CCCACACTGA 


TTATGACACG 


CATACTCGGA 


GCTATGCTAA 


CCAGCGTAGC 


CCCGATGTAA 


33300 


GCTTTGTTGC 


ATGGGCGGCG 


ATATAAAATG 


CAAGGTGCTG 


CTCAAAAAAT 


CAGGCAAAGC 


33360 


CTCGCGCAAA 


AAAGAAAGCA 


CATCGTAGTC 


ATGCTCATGC 


AGATAAAGGC 


AGGTAAGCTC 


33420 


CGGAACCACC 


ACAGAAAAAG 


ACACCATTTT 


TCTCTCAAAC 


ATGTCTGCGG 


GTTTCTGCAT 


33480 


AAACACAAAA 


TAAAATAACA 


AAAAAACATT 


TAAACATTAG 


AAGCCTGTCT 


TACAACAGGA 


33540 


AAAACAACCC 


TTATAAGCAT 


AAGACGGACT 


ACGGCCATGC 


CGGCGTGACC 


GTAAAAAAAC 


33600 


TGGTCACCGT 


GATTAAAAAG 


CACCACCGAC 


AGCTCCTCGG 


TCATGTCCGG 


AGTCATAATG 


33660 


TAAGACTCGG 


TAAACACATC 


AGGTTGATTC 


ATCGGTCAGT 


GCTAAAAAGC 


GACCGAAATA 


33720 


GCCCGGGGGA 


ATACATACCC 


GCAGGCGTAG 


AGACAACATT 


ACAGCCCCCA 


TAGGAGGTAT 


33780 


AACAAAATTA 


ATAGGAGAGA 


AAAACACATA 


AACACCTGAA 


AAACCCTCCT 


GCCTAGGCAA 


33840 


AATAGCACCC 


TCCCGCTCCA 


GAACAACATA 


CAGCGCTTCA 


CAGCGGCAGC 


CTAACAGTCA 


33900 


GCCTTACCAG 


TAAAAAAGAA 


AACCTATTAA 


AAAAACACCA 


CTCGACACGG 


CACCAGCTCA 


33960 


ATCAGTCACA 


GTGTAAAAAA 


GGGCCAAGTG 


CAGAGCGAGT 


ATATATAGGA 


CTAAAAAATG 


34020 


ACGTAACGGT 


TAAAGTCCAC 


AAAAAACACC 


CAGAAAACCG 


CACGCGAACC 


TACGCCCAGA 


34080 




AAAAAACCCA 


CAACTTCCTC 


AAATCGTCAC 


TTCCGTTTTC 


CCACGTTACG 


34140 


TAACTTCCCA 


TTTTAAGAAA 


ACTACAATTC 


CCAACACATA 


CAAGTTACTC 


CGCCCTAAAA 


34200 


CCTACGTCAC 


CCGCCCCGTT 


CCCACGCCCC 


GCGCCACGTC 


ACAAACTCCA 


CCCCCTCATT 


34260 


ATCATATTGG 


CTTCAATCCA 


AAATAAGGTA 


TATTATTGAT 


GAT 




34303 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCAGGCTTTA CACTTTATGC TTCCGGCTCG TATGTTGTGT GGAATTGTGA GCGGATAACA 60

ATTTCACACA GGAAACAGCT ATGACCATGA TTACGCCAAG CGCGCAATTA ACCCTCACTA 120

AAGGGAACAA AAGCTGGGTA CCGGGCCCCC CCTCGAGGTC GACGGTATCG ATAAGCTTAC 180

GCGTGGCCTA GGCGGCCGAA TTCCTGCAGC CCGGGGGATC CACTAGTTCT AGAGCGGCCG 240

CCACCGCGGC GCCTTAATTA ATACGTAAGC TCCAATTCGC CCTATAGTGA GTCGTATTAC 300

GCGCGCTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG CGTTACCCAA 360

CTTAATCGCC TTGCAGCACA 380

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGCCGCAGCA CCGGATGCAT C 21

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GCGTCCGGAG GCTGCCATGC GGCAGGG 27

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1481 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCTGCCATG CGGCAGGGAT ACGGCGCTAA CGATGCATCT CAACAATTGT TGTGTAGGTA 60

CTCCGCCGCC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC CTCTCGAGAA 120

AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC CGTGGCGGGC GGCAGCGGGC 180

GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC TGCTGATGAT GTAATTAAAG TAGGCGGTCT 240

TGAGACGGCG GATGGTCGAC AGAAGCACCA TGTCCTTGGG TCCGGCCTGC TGAATGCGCA 300

GGCGGTCGGC CATGCCCCAG GCTTCGTTTT GACATCGGCG CAGGTCTTTG TAGTAGTCTT 360

GCATGAGCCT TTCTACCGGC ACTTCTTCTT CTCCTTCCTC TTGTCCTGCA TCTCTTGCAT 420

CTATCGCTGC GGCGGCGGCG GAGTTTGGCC GTAGGTGGCG CCCTCTTCCT CCCATGCGTG 480

TGACCCCGAA GCCCCTCATC GGCTGAAGCA GGGCTAGGTC GGCGACAACG CGCTCGGCTA 540

ATATGGCCTG CTGCACCTGC GTGAGGGTAG ACTGGAAGTC ATCCATGTCC ACAAAGCGGT 600

GGTATGCGCC CGTGTTGATG GTGTAAGTGC AGTTGGCCAT AACGGACCAG TTAACGGTCT 660

GGTGACCCGG CTGCGAGAGC TCGGTGTACC TGAGACGCGA GTAAGCCCTC GAGTCAAATA 720

CGTAGTCGTT GCAAGTCCGC ACCAGGTACT GGTATCCCAC CAAAAAGTGC GGCGGCGGCT 780

GGCGGTAGAG GGGCCAGCGT AGGGTGGCCG GGGCTCCGGG GGCGAGATCT TCCAACATAA 840

GGCGATGATA TCCGTAGATG TACCTGGACA TCCAGGTGAT GGCGGCGGCG GTGGTGGAGG 900

CGCGCGGAAA GTCGCGGACG CGGTTCCAGA TGTTGCGCAG CGGCAAAAAG TGCTCCATGG 960

TCGGGACGCT CTGGCCGGTC AGGCGCGCGC AATCGTTGAC GCTCTACCGT GCAAAAGGAG 1020

AGCCTGTAAG CGGGCACTCT TCCGTGGTCT GGTGGATAAA TTCGCAAGGG TATCATGGCG 1080

GACGACCGGG GTTCGAGCCC CGTATCCGGC CGTCCGCCGT GATCCATGCG GTTACCGCCC 1140

GCGTGTCGAA CCCAGGTGTG CGACGTCAGA CAACGGGGGA GTGCTCCTTT TGGCTTCCTT 1200

CCAGGCGCGG CGGCTGCTGC GCTAGCTTTT TTGGCCACTG GCCGCGCGCA GCGTAAGCGG 1260

TTAGGCTGGA AAGCGAAAGC ATTAAGTGGC TCGCTCCCTG TAGCCGGAGG GTTATTTTCC 1320

AAGGGTTGAG TCGCGGGACC CCCGGTTCGA GTCTCGGACC GGCCGGACTG CGGCGAACGG 1380

GGGTTTGCCT CCCCGTCATG CAAGACCCCG CTTGCAAATT CCTCCGGAAA CAGGGACGAG 1440

CCCCTTTTTT GCTTTTCCCA GATGCATCCG GTGCTGCGGC A 1481



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3364 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAATTCCCCA TCCTGGTCTA TAGAGAGAGT TCCAGAACAG CCAGGGCTAC AGATAAACCC 60

ATCTGGAAAA ACAAAGTTGA ATGACCCAAG AGGGGTTCTC AGAGGGTGGC GTGTGCTCCC 120

TGGCAAGCCT ATGACATGGC CGGGGCCTGC CTCTCTCTGC CTCTGACCCT CAGTGGCTCC 180

CATGAACTCC TTGCCCAATG GCATCTTTTT CCTGCGCTCC TTGGGTTATT CCAGTCTCCC 240

CTCAGCATTC CTTCCTCAGG GCCTCGCTCT TCTCTCTGCT CCCTCCTTGC ACAGCTGGCT 300

CTGTCCACCT CAGATGTCAC AGTGCTCTCT CAGAGGAGGA AGGCACCATG TACCCTCTGT 360

TTCCCAGGTA AGGGTTCAAT TTTTAAAAAT GGTTTTTTGT TTGTTTGTTT GTTTGTTTGT 420

TTGTTTGTTT TTCAAGACAG GGCTCCTCTG TGTAGTCCTA ACTGTCTTGA AACTCCCTCT 480

GTAGACCAGG TCGACCTCGA ACTCTTGAAA CCTGCCACGG ACCACCCAGT CAGGTATGGA 540

GGTCCCTGGA ATGAGCGTCC TCGAAGCTAG GTGGGTAAGG GTTCGGCGGT GACAAACAGA 600

AACAAACACA GAGGCAGTTT GAATCTGAGT GTATTTTGCA GCTCTCAAGC AGGGGATTTT 660

ATACATAAAA AAAAAAAAAA AAAAAAAACC AAACATTACA TCTCTTAGAA ACTATATCCA 720

ATGAAACAAT CACAGATACC AACCAAAACC ATTGGGCAGA GTAAAGCACA AAAATCATCC 780

AAGCATTACA ACTCTGAAAC CATGTATTCA GTGAATCACA AACAGAACAG GTAACATCAT 840

TATTAATATA AATCACCAAA ATATAACAAT TCTAAAAGGA TGTATCCAGT GGGGGCTGTC 900

GTCCAAGGCT AGTGGCAGAT TTCCAGGAGC AGGTTAGTAA ATCTTAACCA CTGAACTAAC 960

TCTCCAGCCC CATGGTCAAT TATTATTTAG CATCTAGTGC CTAATTTTTT TTTATAAATC 1020

TTCACTATGT AATTTAAAAC TATTTTAATT CTTCCTAATT AAGGCTTTCT TTACCATATA 1080

CCAAAATTCA CCTCCAATGA CACACGCGTA GCCATATGAA ATTTTATTGT TGGGAAAATT 1140

TGTACCTATC ATAATAGTTT TGTAAATGAT TTAAAAAGCA AAGTGTTAGC CGGGCGTGGT 1200

GGCACACGCC TTTAATCCCT GCACTCGGGA GGCAGGGGCA GGAGGATTTC TGAGTTTGAG 1260

GCCAGCCTGG TCTACAGAGT GAGTTCCAGG ACAGCCAGGG CTACACAGAG AAACCCTGTC 1320

TCGAACCCCC CACCCCCCAA AAAAAGCAAA GTGTTGGTTT CCTTGGGGAT AAAGTCATGT 1380

TAGTGGCCCA TCTCTAGGCC CATCTCACCC ATTATTCTCG CTTAAGATCT TGGCCTAGGC 1440

TACCAGGAAC ATGTAAATAA GAAAAGGAAT AAGAGAAAAC AAAACAGAGA GATTGCCATG 1500

AGAACTACGG CTCAATATTT TTTCTCTCCG GCGAAGAGTT CCACAACCAT CTCCAGGAGG 1560

CCTCCACGTT TTGAGGTCAA TGGCCTCAGT CTGTGGAACT TGTCACACAG ATCTTACTGG 1620

AGGTGGTG7G GCAGAAACCC ATTCCTTTTA GTGTCTTGGG CTAAAAGTAA AAGGCCCAGA 1680

GGAGGCCTTT GCTCATCTGA CCATGCTGAC AAGGAACACG GGTGCCAGGA CAGAGGCTGG 1740

ACCCCAGGAA CACCTTAAAC ACTTCTTCCC TTCTCCGCCC CCTAGAGCAG GCTCCCCTCA 1800

CCAGCCTGGG CAGAAATGGG GGAAGATGGA GTGAAGCCAT ACTGGCTACT CCAGAATCAA 1860

CAGAGGGAGC CGGGGGCAAT ACTGGAGAAG CTGGTCTGCC CCCAGGGGCA ATCCTGGCAC 1920

CTCCCAGGCA GAAGAGGAAA CTTCCACAGT GCATCTCACT TCCATGAATC CCCTCCTCGG 1980

ACTCTGAGGT CCTTGGTCAC AGCTGAGGTG CAAAAGGCTC CTGTCATATT GTGTCCTGCT 2040

CTGGTCTGCC TTCCACAGCT TGGGGGCCAC CTAGCCCACC TCTCCCTAGG GATGAGAGCA 2100

GCCACTACGG GTCTAGGCTG CCCATGTAAG GAGGCAAGGC CTGGGGACAC CCGAGATGCC 2160

TGGTTATAAT TAACCCAGAC ATGTGGCTGC CCCCCCCCCC CCAACACCTG CTGCCTGAGC 2220

CTCACCCCCA CCCCGGTGCC TGGGTCTTAG GCTCTGTACA CCATGGAGGA GAAGCTCGCT 2280

CTAAAAATAA CCCTGTCCCT GGTGGATCCA GGGTGAGGGG CAGGCTGAGG GCGGCCACTT 2340

CCCTCAGCCG CAGGTTTGTT TTCCCAAGAA TGGTTTTTCT GCTTCTGTAG CTTTTCCTGT 2400

CAATTCTGCC ATGGTGGAGC AGCCTGCACT GGGCTTCTGG GAGAAACCAA ACCGGGTTCT 2460

AACCTTTCAG CTACAGTTAT TGCCTTTCCT GTAGATGGGC GACTACAGCC CCACCCCCAC 2520

CCCCGTCTCC TGTATCCTTC CTGGGCCTGG GGATCCTAGG CTTTCACTGG AAATTTCCCC 2580

CCAGGTGCTG TAGGCTAGAG TCACGGCTCC CAAGAACAGT GCTTGCCTGG CATGCATGGT 2640

TCTGAACCTC CAACTGCAAA AAATGACACA TACCTTGACC CTTGGAAGGC TGAGGCAGGG 2700

GGATTGCCAT GAGTGCAAAG CCAGACTGGG TGGCATAGTT AGACCCTGTC TCAAAAAACC 2760

AAAAACAATT AAATAACTAA AGTCAGGCAA GTAATCCTAC TCGGGAGACT GAGGCAGAGG 2820

GATTGTTACA TGTCTGAGGC CAGCCTGGAC TACATAGGGT TTCAGGCTAG CCCTGTCTAC 2880

AGAGTAAGGC CCTATTTCAA AAACACAAAC AAAATGGTTC TCCCAGCTGC TAATGCTCAC 2940

CAGGCAATGA AGCCTGGTGA GCATTAGCAA TGAAGGCAAT GAAGGAGGGT GCTGGCTACA 3000

ATCAAGGCTG TGGGGGACTG AGGGCAGGCT GTAACAGGCT TGGGGGCCAG GGCTTATACG 3060

TGCCTGGGAC TCCCAAAGTA TTACTGTTCC ATGTTCCCGG CGAAGGGCCA GCTGTCCCCC 3120

GCCAGCTAGA CTCAGCACTT AGTTTAGGAA CCAGTGAGCA AGTCAGCCCT TGGGGCAGCC 3180

CATACAAGGC CATGGGGCTG GGCAAGCTGC ACGCCTGGGT CCGGGGTGGG CACGGTGCCC 3240

GGGCAACGAG CTGAAAGCTC ATCTGCTCTC AGGGGCCCCT CCCTGGGGAC AGCCCCTCCT 3300

GGCTAGTCAC ACCCTGTAGG CTCCTCTATA TAACCCAGGG GCACAGGGGC TGCCCCCGGG 3360
^^^^ 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGTACCACTA CGGGTCTAGG CTGCCCATGT AAGGAGGCAA GGCCTGGGGA CACCCGAGAT 60

GCCTGGTTAT AATTAACCCC AACACCTGCT GCCCCCCCCC CCCCAACACC TGCTGCCTGA 120

GCCTGAGCGG TTACCCCACC CCGGTGCCTG GGTCTTAGGC TCTGTACACC ATGG 174

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 699 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AGATCTTCCT AGAAATTCGC TGTCTGCGAG GGCCGGCTGT TGGGGTGAGT ACTCCCTCTC 6 0 

AAAAGCGGGC ATGACTTCTG CGCTAAGATT GTCAGTTTCC AAAAACGAGG AGGATTTGAT 12 0 

ATTCACCTGG CCCGCGGTGA TGCCTTTGAG GGTGGCCGCG TCCATCTGGT CAGAAAAGAC 180 

AATCTTTTTG TTGTCAAGCT TGAGGTGTGG CAGGCTTGAG ATCGATCTGG CCATACACTT 24 0 

GAGTGACAAT GACATCCACT TTGCCTTTCT CTCCACAGGT GTCCACTCCC AGGTCCAACC 3 00 

GCGGATCTCC CGGGACCATG CCCAAGAAGA AGAGGAAGGT GTCCAATTTA CTGACCGTAC 360 

ACCAAAATTT GCCTGCATTA CCGGTCGATG CAACGAGTGA TGAGGTTCGC AAGAACCTGA 420 

TGGACATGTT CAGGGATCGC CAGGCGTTTT CTGAGCATAC CTGGAAAATG CTTCTGTCCG 480

TTTGCCGGTC GTGGGCGGCA TGGTGCAAGT TGAATAACCG GAAATGGTTT CCCGCAGAAC 540

CTGAAGATGT TCGCGATTAT CTTCTATATC TTCAGGCGCG CGGTCTGGCA GTAAAAACTA 600

TCCAGCAACA TTTGGGCCAG CTAAACATGC TTCATCGTCG GTCCGGGCTG CCACGACCAA 660

GTGACAGCAA TGCTGTTTCA CTGGTTATGC GGCGGATCC 699



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 68 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GAAGATCTAT AACTTCGTAT AATGTATGCT ATACGAAGTT ATTACCGAAG AAATGGCTCG 60

AGATCTTC 68

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 68 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GAAGATCTCG AGCCATTTCT TCGGTAATAA CTTCGTATAG CATACATTAT ACGAAGTTAT 60

AGATCTTC 68
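
SEQ ID NO: 12 and SEQ ID NO: 13 are both listed as 68-base single-stranded molecules, and a direct comparison suggests that each is the reverse complement of the other, as would be expected for a pair of oligonucleotides intended to anneal. The sketch below is illustrative only; the function name `reverse_complement` is an assumption and the listing itself does not state the relationship.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse to read the other strand 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing, grouped in tens as printed.
SEQ12 = ("GAAGATCTAT" "AACTTCGTAT" "AATGTATGCT" "ATACGAAGTT"
         "ATTACCGAAG" "AAATGGCTCG" "AGATCTTC")
SEQ13 = ("GAAGATCTCG" "AGCCATTTCT" "TCGGTAATAA" "CTTCGTATAG"
         "CATACATTAT" "ACGAAGTTAT" "AGATCTTC")

# True when the two oligonucleotides can anneal over their full length.
pair_is_complementary = reverse_complement(SEQ12) == SEQ13
```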

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CCACATGTAT AACTTCGTAT AGCATACATT ATACGAAGTT ATACATGTGG 50 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CCACATGTAT AACTTCGTAT AATGTATGCT ATACGAAGTT ATACATGTGG 50
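
SEQ ID NO: 14 and SEQ ID NO: 15 differ only in their central region. As a hedged illustration, the sketch below searches each for the 34-bp loxP site on either strand. The `LOXP` constant is the site as it is commonly written in the literature and is an assumption here, since the listing does not annotate it; `find_site` is a hypothetical helper.

```python
# Assumption: the 34-bp loxP site as commonly cited (13-bp repeat, 8-bp spacer, 13-bp repeat).
LOXP = "ATAACTTCGTATA" "ATGTATGC" "TATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse."""
    return seq.translate(COMPLEMENT)[::-1]


def find_site(seq: str, site: str):
    """Return (0-based offset, '+' or '-') for the first hit on either strand, else None."""
    pos = seq.find(site)
    if pos != -1:
        return pos, "+"
    pos = seq.find(reverse_complement(site))
    if pos != -1:
        return pos, "-"
    return None


# Sequences copied from the listing, grouped in tens as printed.
SEQ14 = "CCACATGTAT" "AACTTCGTAT" "AGCATACATT" "ATACGAAGTT" "ATACATGTGG"
SEQ15 = "CCACATGTAT" "AACTTCGTAT" "AATGTATGCT" "ATACGAAGTT" "ATACATGTGG"

hit14 = find_site(SEQ14, LOXP)
hit15 = find_site(SEQ15, LOXP)
```

Under these assumptions the site is found at the same offset in both oligonucleotides but on opposite strands, which is consistent with the two sequences being reverse complements of one another.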
1. A recombinant plasmid, comprising in operable combination:

a) a plasmid backbone, comprising an origin of replication, an
antibiotic resistance gene and a eukaryotic promoter element;

b) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in a tail
to tail orientation on said plasmid backbone;

c) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end and linked to one of said ITRs; and

d) a first gene of interest operably linked to said promoter
element.

2. The recombinant plasmid of Claim 1, wherein the total size of said
plasmid is between 27 and 40 kilobase pairs.

3. The recombinant plasmid of Claim 1, wherein said 5' end of said
packaging sequence is linked to said 3' end of said left ITR.

4. The recombinant plasmid of Claim 3, wherein said first gene of 
interest is linked to said 3' end of said packaging sequence. 

5. The recombinant plasmid of Claim 4, wherein said first gene of
interest is the dystrophin cDNA gene.

6. The recombinant plasmid of Claim 4, further comprising a second
gene of interest.

7. The recombinant plasmid of Claim 6, wherein said second gene of
interest is linked to said 3' end of said right ITR.

8. The recombinant plasmid of Claim 7, wherein said second gene of
interest is a reporter gene.

9. The recombinant plasmid of Claim 8, wherein said reporter gene is
selected from the group consisting of the E. coli β-galactosidase gene, the human
placental alkaline phosphatase gene, the green fluorescent protein gene and the
chloramphenicol acetyltransferase gene.

10. A mammalian cell line containing the recombinant plasmid of
Claim 1.

11. The cell line of Claim 10, wherein said cell line is a 293-derived cell
line.

12. A helper adenovirus comprising i) first and a second loxP sequences,
and ii) the adenovirus packaging sequence, said packaging sequence having a 5'
and a 3' end.

13. The helper adenovirus of Claim 12, wherein said first loxP sequence
is linked to the 5' end of said packaging sequence and said second loxP sequence is
linked to said 3' end of said packaging sequence.

14. The helper adenovirus of Claim 12, further comprising at least one
adenovirus gene coding region.

15. A mammalian cell line, comprising:

a) a recombinant plasmid, comprising, in operable combination:

i) a plasmid backbone, comprising an origin of
replication, an antibiotic resistance gene and a eukaryotic promoter
element,

ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in
a tail to tail orientation on said plasmid backbone,

iii) the adenovirus packaging sequence, said packaging
sequence having a 5' and a 3' end and linked to one of said ITRs,
and

iv) a first gene of interest operably linked to said
promoter element; and

b) a helper adenovirus comprising i) first and a second loxP
sequences, and ii) the adenovirus packaging sequence, said packaging
sequence having a 5' and a 3' end.



16. The cell line of Claim 15, wherein said helper adenovirus further 
comprises at least one adenovirus gene coding region. 

17. The cell line of Claim 15, wherein said first gene of interest is the
dystrophin cDNA gene.

18. A method of producing an adenovirus minichromosome, comprising:

A) providing a mammalian cell line containing:

a) a recombinant plasmid, comprising, in operable
combination, i) a plasmid backbone, comprising an origin of
replication, an antibiotic resistance gene and a eukaryotic promoter
element, ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in
a tail to tail orientation on said plasmid backbone, iii) the adenovirus
packaging sequence, said packaging sequence having a 5' and a 3'
end and linked to one of said ITRs, and iv) a first gene of interest
operably linked to said promoter element; and

b) a helper adenovirus comprising i) first and a second
loxP sequences, ii) at least one adenovirus gene coding region, and
iii) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end; and

B) growing said cell line under conditions such that said
adenovirus gene coding region is expressed and said recombinant plasmid
directs the production of at least one adenoviral minichromosome.



19. The method of Claim 18, wherein said adenovirus minichromosome
is encapsidated.




20. The method of Claim 18, further comprising 3) recovering said
encapsidated adenovirus minichromosome.



21. The method of Claim 20, further comprising 4) purifying said
recovered encapsidated adenovirus minichromosome.

22. The method of Claim 21, further comprising 5) administering said
purified encapsidated adenovirus minichromosome to a host.

23. The method of Claim 22, wherein said host is a mammal.

24. The method of Claim 23, wherein said mammal is a human. 

25. A recombinant adenovirus comprising the adenovirus E2b region
having a deletion, said adenovirus capable of self-propagation in a packaging cell
line and said E2b region comprising the DNA polymerase gene and the adenovirus
preterminal protein gene.

26. The recombinant adenovirus of Claim 25, wherein said deletion is 
within the adenovirus DNA polymerase gene. 

27. The recombinant adenovirus of Claim 25, wherein said deletion is
within the adenovirus preterminal protein gene.

28. The recombinant adenovirus of Claim 25, wherein said deletion is 
within the adenovirus DNA polymerase and preterminal protein genes. 

29. A mammalian cell line stably and constitutively expressing the
adenovirus E1 gene products and the adenovirus DNA polymerase.

30. The cell line of Claim 29, wherein said cell line comprises a
recombinant adenovirus comprising a deletion within the E2b region, said
recombinant adenovirus being capable of self-propagation in said cell line.

31. The cell line of Claim 30, wherein said deletion within said E2b
region comprises a deletion within the adenoviral DNA polymerase gene.



32. The cell line of Claim 29, wherein the genome of said cell line contains
a nucleotide sequence encoding adenovirus DNA polymerase operably linked to a
heterologous promoter.

33. The cell line of Claim 32, wherein said cell line is selected from the
group consisting of the B-6, B-9, C-1, C-4, C-7, C-13, and C-14 cell lines.

34. The cell line of Claim 29 further constitutively expressing the 
adenovirus preterminal protein gene product. 

35. The cell line of Claim 34, wherein said cell line comprises a
recombinant adenovirus comprising a deletion within the E2b region, said
recombinant adenovirus being capable of self-propagation in said cell line.

36. The cell line of Claim 35, wherein said deletion within said E2b 
region comprises a deletion within the adenoviral preterminal protein gene. 

37. The cell line of Claim 36, wherein said deletion within said E2b
region comprises a deletion within the adenoviral DNA polymerase and preterminal
protein genes.

38. The cell line of Claim 34, wherein the genome of said cell line
contains a nucleotide sequence encoding adenovirus preterminal protein operably
linked to a heterologous promoter.

39. The cell line of Claim 38, wherein said cell line is selected from the
group consisting of the C-1, C-4, C-7, C-13, and C-14 cell lines.

40. A method of producing infectious recombinant adenovirus particles
containing an adenoviral genome containing a deletion within the E2b region,
comprising:

a) providing:

i) a mammalian cell line stably and constitutively
expressing the adenovirus E1 gene products and the adenovirus DNA
polymerase;

ii) a recombinant adenovirus comprising a deletion within
the E2b region, said recombinant adenovirus being capable of
self-propagation in said cell line;

b) introducing said recombinant adenovirus into said cell line
under conditions such that said recombinant adenovirus is propagated to
form infectious recombinant adenovirus particles; and

c) recovering said infectious recombinant adenovirus particles.

41. The method of Claim 40, further comprising d) purifying said
recovered infectious recombinant adenovirus particles.

42. The method of Claim 41, further comprising e) administering said
purified recombinant adenovirus particles to a host.

43. The method of Claim 40, wherein said mammalian cell line further
constitutively expresses the adenovirus preterminal protein.

44. A recombinant plasmid capable of replicating in a bacterial host 
comprising adenoviral E2b sequences, said E2b sequences containing a deletion 
within the polymerase gene, said deletion resulting in reduced polymerase activity. 

45. The recombinant plasmid of Claim 44, wherein said deletion
comprises a deletion of nucleotides 8772 to 9385 in SEQ ID NO:4.

46. The recombinant plasmid of Claim 45, wherein said plasmid has the
designation pΔpol.

47. The recombinant plasmid of Claim 45, wherein said plasmid has the
designation pBHG11Δpol.



48. A recombinant plasmid capable of replicating in a bacterial host
comprising adenoviral E2b sequences, said E2b sequences containing a deletion
within the preterminal protein gene, said deletion resulting in the inability to
express functional preterminal protein without disruption of the VA RNA genes.

49. The recombinant plasmid of Claim 48, wherein said deletion within
said preterminal protein gene comprises a deletion of nucleotides 10,705 to 11,134
in SEQ ID NO:4.

50. The recombinant plasmid of Claim 49, wherein said plasmid has the
designation pΔpTP.

51. The recombinant plasmid of Claim 49, wherein said plasmid has the
designation pBHG11ΔpTP.

52. The recombinant plasmid of Claim 48 further comprising a deletion
within the polymerase gene, said deletion resulting in reduced polymerase activity.

53. The recombinant plasmid of Claim 52, wherein said deletion within
said polymerase and said preterminal protein genes comprises a deletion of
nucleotides 8,773 to 9586 and 11,067 to 12,513 in SEQ ID NO:4.

54. The recombinant plasmid of Claim 53, wherein said plasmid has the
designation pAXBΔpolΔpTPVARNA+t13.

55. The recombinant plasmid of Claim 53, wherein said plasmid has the
designation pBHG11ΔpolΔpTPVARNA+t13.